PREFACE


This book was written with the intention of providing an introductory textbook, with 
the emphasis on general principles rather than on practical applications. I tried to
make the book useful to as wide a range of readers as possible, particularly biologists
who, like myself, have no more than ordinary mathematical ability. The mathematics 
does not go beyond simple algebra; neither calculus nor matrix methods are used. 
Some knowledge of statistics, however, is assumed, particularly of the analysis of 
variance and of correlation and regression. 

The second edition kept the same structure but was somewhat enlarged by the 
inclusion of developments in the intervening twenty years, and by more attention 
being given to plants. In consequence the book came to contain a good deal more 
material than is needed by those for whom the subject is part of a course on general 
genetics. The section headings, however, should facilitate the selection of what is 
relevant. My main regret then, as it is now, was the impossibility of mentioning
more than a very few of the experimental studies that have illuminated the subject
since the book first appeared. 

The revisions made in this new edition are less extensive. The desire not to
increase the length of the book has meant that many of the recent developments are
noted by little more than references to the sources. The demonstration that mutation
is not negligible for quantitative genetics has, however, necessitated more substantial
revision of Chapter 12 and to a lesser extent Chapters 15 and 20.

The Problems, which were hitherto published separately, are now put together
with the text, following the chapters to which they refer. They are of varying
difficulty and I hope that all students will find some that they can solve immediately
and some also that will tax their ingenuity to the full. Some of the problems are
based on the data and solutions of earlier ones. Students are therefore advised to
keep their workings for later use; this will save the repetition of calculations. I have
based the problems on real data wherever I could, to make them more interesting
and realistic. In consequence, however, the arithmetic seldom works out simply
and a pocket calculator will be needed for most of them. A few of the problems
have been revised for this edition. The solutions are at the end of the book, arranged
in a different order from the problems so as to avoid the risk of inadvertently seeing
the solution of the next problem. The solutions are not simply answers but give fairly
full explanations of how the problems are solved.

INTRODUCTION


Quantitative genetics is concerned with the inheritance of those differences between 
individuals that are of degree rather than of kind, quantitative rather than qualitative. 
These are the individual differences which, as Darwin wrote, ‘afford materials for 
natural selection to act on and accumulate, in the same manner as man accumulates 
in any given direction individual differences in his domestic productions’. An 
understanding of the inheritance of these differences is thus of fundamental 
significance in the study of evolution and in the application of genetics to animal 
and plant breeding; and it is from these two fields of enquiry that the subject has 
received the chief impetus to its growth. 

Virtually every organ and function of any species shows individual differences 
of this nature, the differences of size among ourselves or our domestic animals being 
an example familiar to all. Individuals form a continuously graded series from one 
extreme to the other and do not fall naturally into sharply demarcated types. 
Qualitative differences, in contrast, divide individuals into distinct types with little 
or no connexion by intermediates. Examples are the differences between blue-eyed 
and brown-eyed individuals, between the blood groups, or between normally coloured 
and albino individuals. The familiar Mendelian ratios, which display the mechanism 
of inheritance, can be seen only when a gene difference at a single locus gives rise 
to a readily detectable difference in some such property of the organism. Quantitative 
differences, in so far as they are inherited, depend on genes whose effects are small 
in relation to the variation arising from other causes. Furthermore, quantitative
differences are usually, though not necessarily always, influenced by gene differences
at many loci. Consequently the individual genes, whether few or many, cannot be 
identified by their segregation; the Mendelian ratios are not displayed, and the 
methods of Mendelian analysis cannot be applied. 

It is, nevertheless, a basic premiss of quantitative genetics that the inheritance 
of quantitative differences depends on genes subject to the same laws of transmission
and having the same general properties as the genes whose transmission and
properties are displayed by qualitative differences. Quantitative genetics is therefore 
an extension of Mendelian genetics, resting squarely on Mendelian principles as its 
foundation. 

The methods of study in quantitative genetics differ from those employed in 
Mendelian genetics in two respects. In the first place, since ratios cannot be observed, 
single progenies are uninformative, and the unit of study must be extended to ‘populations’,
that is, larger groups of individuals comprising many progenies. And, in the
second place, the nature of the quantitative differences to be studied requires the
measurement, and not just the classification, of the individuals. The extension of 
Mendelian genetics into quantitative genetics may thus be made in two stages, the 
first introducing new concepts connected with the genetic properties of ‘populations’ 
and the second introducing concepts connected with the inheritance of measurements. 
This is how the subject is presented in this book. In the first part, which occupies 
Chapters 1 to 5, the genetic properties of populations are described by reference 
to genes causing easily identifiable, and therefore qualitative, differences. Quantitative
differences are not discussed until the second part, which starts in Chapter
6. These two parts of the subject are often distinguished by different names, the 
first being referred to as ‘population genetics’ and the second as ‘quantitative genetics’ 
or ‘biometrical genetics’. 

The theoretical basis of quantitative genetics was established round about 1920 
by the work of Fisher (1918), Haldane (summarized 1932) and Wright (1921). The 
development of the subject over the succeeding years, by these and many other 
geneticists and statisticians, has been mainly by elaboration, clarification, and the 
filling in of details, so that today we have a substantial body of theory accepted by 
the majority as valid. 

The theory consists of the deduction of the consequences of Mendelian inheritance 
when extended to the properties of populations and to the simultaneous segregation 
of genes at many loci. The premiss from which the deductions are made is that the 
inheritance of quantitative differences is by means of genes, and that these genes 
are subject to the Mendelian laws of transmission and may have any of the properties
known from Mendelian genetics. The property of ‘variable expression’ assumes
great importance and might be raised to the status of another premiss: that the expression
of the genotype in the phenotype is modifiable by non-genetic causes. Other
properties whose consequences are taken into account include dominance, epistasis, 
pleiotropy, linkage, and mutation. The theory then allows us to deduce what will 
be the genetic properties of a population if the genes have the properties postulated. 
It allows us also to predict the consequences of any specified breeding plan, including
those of natural selection. It therefore forms the basis for understanding evolutionary
change. The main practical use of the theory is in comparing the merits of
alternative procedures for animal and plant improvement. 

The experimental side of quantitative genetics has three roles, complementary to 
the theoretical side. First, experimental study of populations allows us to deduce 
the properties of the genes associated with quantitative variation. Second, experimental
breeding allows us to test the validity of the theory. And third, there are
some consequences of breeding procedures that cannot be predicted from the theory, 
and questions about these can be answered only by experiment. There is now a large 
body of experimental data which substantiates the theory in considerable detail, showing
that the genes concerned with quantitative variation do have the properties known
from Mendelian genetics, and that the outcome of most breeding procedures can
be predicted with some confidence. The aim is to describe all that is reasonably firmly
established and, for the sake of clarity, to simplify as far as is possible without being
misleading. Consequently, the emphasis is on the theoretical side. Though conclusions
will often be drawn directly from experimental data, the experimental side
of the subject is presented chiefly in the form of examples, chosen with the purpose
of illustrating the theoretical conclusions. These examples, however, cannot always
be taken as substantiating the postulates that underlie the conclusions they illustrate. 
Too often the results of experiments are open to more than one interpretation. The 
experimental work mentioned is only a very small, and far from random, sample 
of what has been done. In particular, a great deal more experimentation has been 
done with plants and farm animals than would appear from its representation among 
the work cited. 

No attempt has been made to give exhaustive references to published work in any 
part of the subject; or to indicate the origins, or trace the history of the ideas. To 
have done this would have required a much longer book, and a considerable sacrifice 
of clarity. Most of the material in the book is covered more fully in one or other 
of the sources listed below. These sources are not regularly cited in the text. 
References are given in the text when any conclusion is stated without full explanation
of its derivation. These references are not always to the original papers, but
rather to the more recent papers where the reader will find a convenient point of 
entry to the topic under discussion. A selection of the original papers that have most 
influenced the development of the subject is reprinted with extensive commentaries 
by Hill (1984) in the Benchmark Papers in Genetics series (Vol. 15). 

Chief sources 

(For full bibliographical details see list of References) 

Becker (1984) Manual of Quantitative Genetics.
Bulmer (1985) The Mathematical Theory of Quantitative Genetics.
Crow (1986) Basic Concepts in Population, Quantitative, and Evolutionary Genetics.
Crow and Kimura (1970) An Introduction to Population Genetics Theory.
Hartl (1980) Principles of Population Genetics.
Hedrick (1985) Genetics of Populations.
Jacquard (1974) The Genetic Structure of Populations.
Kempthorne (1957) An Introduction to Genetic Statistics.
Li (1976) First Course in Population Genetics.
Mather and Jinks (1977) Introduction to Biometrical Genetics.
Mather and Jinks (1982) Biometrical Genetics.
Mayo (1987) The Theory of Plant Breeding.
Wright (1968–78) Evolution and the Genetics of Populations, Vols 1–4.



1 GENETIC CONSTITUTION OF A POPULATION


Frequencies of genes and genotypes 

To describe the genetic constitution of a group of individuals we should have to 
specify their genotypes and say how many of each genotype there were. This would 
be a complete description, provided the nature of the phenotypic differences between 
the genotypes did not concern us. Suppose for simplicity that we were concerned 
with a certain autosomal locus, A, and that two different alleles at this locus, A₁
and A₂, were present among the individuals. Then there would be three possible
genotypes, A₁A₁, A₁A₂, and A₂A₂. (We are concerned here, as throughout the
book, exclusively with diploid organisms.) The genetic constitution of the group 
would be fully described by the proportion, or percentage, of individuals that belonged 
to each genotype, or in other words by the frequencies of the three genotypes among 
the individuals. These proportions or frequencies are called genotype frequencies, 
the frequency of a particular genotype being its proportion or percentage among 
the individuals. If, for example, we found one-quarter of the individuals in the group 
to be A₁A₁, the frequency of this genotype would be 0.25, or 25 per cent.
Naturally, the frequencies of all the genotypes together must add up to unity, or 
100 per cent. 


Example 1.1 The M-N blood groups in man are determined by two alleles at a locus, 
and the three genotypes correspond with the three blood groups, M, MN, and N. The 
following figures, taken from the tabulation of Mourant (1954), show the blood group 
frequencies among Eskimos of East Greenland and among Icelanders:

                          Blood group            Number of
                       M       MN      N        individuals
Frequency, %
  Greenland           83.5    15.6     0.9          569
  Iceland             31.2    51.5    17.3          747

Clearly the two populations differ in these genotype frequencies, the N blood group being
rare in Greenland and relatively common in Iceland. Not only is this locus a source
of variation within each of the two populations, but it is also a source of genetic
difference between the populations.

A population, in the genetic sense, is not just a group of individuals, but a breeding
group; and the genetics of a population is concerned not only with the genetic constitution
of the individuals but also with the transmission of the genes from one generation
to the next. In the transmission the genotypes of the parents are broken down
and a new set of genotypes is constituted in the progeny, from the genes transmitted
in the gametes. The genes carried by the population thus have continuity from generation
to generation, but the genotypes in which they appear do not. The genetic constitution
of a population, referring to the genes it carries, is described by the array
of gene frequencies; that is, by specification of the alleles present at every locus
and the numbers or proportions of the different alleles at each locus. If, for example,
A₁ is an allele at the A locus, then the frequency of A₁ genes, or the gene frequency
of A₁, is the proportion or percentage of all genes at this locus that are the
A₁ allele. The frequencies of all the alleles at any one locus must add up to unity,
or 100 per cent.

The gene frequencies at a particular locus among a group of individuals can be 
determined from a knowledge of the genotype frequencies. To take a hypothetical 
example, suppose there are two alleles, A₁ and A₂, and we classify 100 individuals
and count the numbers in each genotype as follows:

                              A₁A₁    A₁A₂    A₂A₂    Total
Number of individuals          30      60      10      100
Number of genes      A₁        60      60       0      120 )
                     A₂         0      60      20       80 )  200


Each individual contains two genes, so we have counted 200 representatives of the
genes at this locus. Each A₁A₁ individual contains two A₁ genes and each A₁A₂ contains
one A₁ gene. So there are 120 A₁ genes in the sample, and 80 A₂ genes. The
frequency of A₁ is therefore 60 per cent or 0.6, and the frequency of A₂ is 40 per
cent or 0.4. To express the relationship in a more general form, let the frequencies
of genes and of genotypes be as follows:


                   Genes            Genotypes
                 A₁     A₂       A₁A₁    A₁A₂    A₂A₂
Frequencies      p      q         P       H       Q

so that p + q = 1 and P + H + Q = 1. Since each individual contains two genes,
the frequency of A₁ genes is ½(2P + H), and the relationship between gene
frequency and genotype frequency among the individuals counted is as follows:

\[
p = P + \tfrac{1}{2}H, \qquad q = Q + \tfrac{1}{2}H \tag*{[1.1]}
\]
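
As a minimal sketch of equation [1.1], the calculation can be written in a few lines of Python. The function name `gene_frequencies` and its argument names are illustrative assumptions, not part of the text; the counts are those of the hypothetical sample of 100 individuals above.

```python
def gene_frequencies(n11, n12, n22):
    """Return (p, q) for alleles A1 and A2 from counts of the
    genotypes A1A1, A1A2 and A2A2, following equation [1.1]."""
    total = n11 + n12 + n22
    if total <= 0:
        raise ValueError("at least one individual is required")
    # Genotype frequencies P, H and Q
    P, H, Q = n11 / total, n12 / total, n22 / total
    p = P + 0.5 * H  # each heterozygote contributes one A1 gene
    q = Q + 0.5 * H  # and one A2 gene
    return p, q


p, q = gene_frequencies(30, 60, 10)
print(round(p, 3), round(q, 3))  # 0.6 0.4
```

Because only the proportions matter, the same function accepts percentages in place of counts.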


Example 1.2 To illustrate the calculation of gene frequencies from genotype frequencies
we may take the M-N blood group frequencies given in Example 1.1. The M and
N blood groups represent the two homozygous genotypes and the MN group the
heterozygote. The frequency of the M gene in Greenland is, from equation [1.1],
0.835 + ½(0.156) = 0.913, and the frequency of the N gene is 0.009 + ½(0.156) = 0.087,
the sum of the frequencies being 1.000 as it should be. Doing the same for the Iceland
sample, we find the following gene frequencies in the two populations, expressed now
as percentages:



                    Gene
                 M        N
Greenland       91.3      8.7
Iceland         57.0     43.0

Thus the two populations differ in gene frequency as well as in genotype frequencies. 

Causes of change

The genetic properties of a population are influenced in the process of transmission 
of genes from one generation to the next by a number of agencies. These form the 
chief subject-matter of the next four chapters, but we may briefly review them here 
in order to have some idea of what factors are being left out of consideration in 
this chapter. The agencies through which the genetic properties of a population may 
be changed are these: 

Population size. The genes passed from one generation to the next are a sample
of the genes in the parent generation. Therefore the gene frequencies are subject
to sampling variation between successive generations, and the smaller the number
of parents the greater is the sampling variation. The effects of sampling variation
will be considered in Chapters 3–5, and meantime we shall exclude it from the discussion
by supposing always that we are dealing with a ‘large population’, which means
simply one in which sampling variation is so small as to be negligible. For practical
purposes a ‘large population’ is one in which the number of adult individuals is in
the hundreds rather than in the tens.
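
To give a rough sense of scale for 'hundreds rather than tens', the sketch below assumes the idealized case in which the genes transmitted are a binomial sample of 2N genes from N diploid parents. That assumption, and the function name `sampling_sd`, are introduced here for illustration only; the full treatment belongs to the later chapters.

```python
import math


def sampling_sd(p, n_parents):
    """Standard deviation of the gene frequency after one generation,
    assuming binomial sampling of 2N genes from N diploid parents."""
    return math.sqrt(p * (1 - p) / (2 * n_parents))


for n in (10, 100, 1000):
    print(n, round(sampling_sd(0.5, n), 3))
```

With a gene frequency of 0.5, the standard deviation is about 0.11 with 10 parents, about 0.035 with 100, and about 0.011 with 1000.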

Differences of fertility and viability. Though we are not at present concerned with 
the phenotypic effects of the genes under discussion, we cannot ignore their effects 
on fertility and viability, because these influence the genetic constitution of the succeeding
generation. The different genotypes among the parents may have different
fertilities, and if they do they will contribute unequally to the gametes out of which 
the next generation is formed. In this way the gene frequency may be changed in 
the transmission. Further, the genotypes among the newly formed zygotes may have 
different survival rates, and so the gene frequencies in the new generation may be 
changed by the time the individuals are adult and themselves become parents. These 
processes are called selection, and will be described in Chapter 2. Meanwhile we 
shall suppose they are not operating. Human blood-group genes may be taken for 
the purpose of illustration since the selective forces acting on them are probably 
not strong. Genes that produce a mutant phenotype which is abnormal in comparison 
with the wild-type are, in contrast, usually subject to much more selection. 

Migration and mutation. The gene frequencies in the population may also be
changed by immigration of individuals from another population, and by gene mutation.
These processes will be described in Chapter 2, and at this stage will also be
supposed not to operate.

Mating system. The genotypes in the progeny are determined by the union of the
gametes in pairs to form zygotes, and the union of gametes is influenced by the mating
of the parents. So the genotype frequencies in the offspring generation are influenced
by the genotypes of the pairs that mate in the parent generation. We shall
first suppose that mating is at random with respect to the genotypes under discussion.
Random mating, or panmixia, means that any individual has an equal chance
of mating with any other individual in the population. The important points are that
there should be no special tendency for mated individuals to be alike in genotype,
or to be related to each other by ancestry. If a population covers a large geographic
area, individuals inhabiting the same locality are more likely to mate than individuals
inhabiting different localities, and so the mated pairs tend to be related by ancestry.
A widely spread population is therefore likely to be subdivided into local groups
and mating is random only within the groups. The properties of subdivided populations
depend on the size of the local groups and will be described under the effects
of population size in Chapters 3–5.

Hardy-Weinberg equilibrium

The Hardy-Weinberg law

In a large random-mating population with no selection, mutation, or migration, the
gene frequencies and the genotype frequencies are constant from generation to generation;
and, furthermore, there is a simple relationship between the gene frequencies
and the genotype frequencies. These properties of a population are derived from
a theorem, or principle, known as the Hardy-Weinberg law after Hardy and
Weinberg, who independently demonstrated them in 1908. A population with constant
gene and genotype frequencies is said to be in Hardy-Weinberg equilibrium.
The relationship between gene frequencies and genotype frequencies is of the greatest
importance because many of the deductions about population genetics and quantitative
genetics rest on it. The relationship is this: if the gene frequencies of two alleles
among the parents are p and q, then the genotype frequencies among the progeny
are p², 2pq, and q², thus:

                 Genes in parents      Genotypes in progeny
                   A₁      A₂          A₁A₁    A₁A₂    A₂A₂
Frequencies        p       q            p²     2pq      q²       ... [1.2]

(The relationship above refers to autosomal genes; sex-linked genes are not quite 
so simple and will be explained later.) The conditions of random mating and no 
selection, required for the Hardy-Weinberg law to hold, refer only to the genotypes
under consideration. There may be preferential mating with respect to other attributes, 
and genotypes of other loci may be subject to selection, without affecting the issue. 
Two additional conditions are that the genes segregate normally in gametogenesis 
and that the gene frequencies are the same in males and females. The reasons for 
these requirements will be seen in the proof. 

The proof of the Hardy-Weinberg law involves four steps, which are summarized
in Table 1.1, with the conditions that must hold for the deduction at each step to 
be valid. The details of the four steps are as follows. 

1. From gene frequency in parents to gene frequency in gametes. Let the parent
generation have gene and genotype frequencies as follows:

                   Genes            Genotypes
                 A₁     A₂       A₁A₁    A₁A₂    A₂A₂
Frequencies      p      q         P       H       Q

Two types of gamete are produced, those bearing A₁ and those bearing A₂. A₁A₁
individuals produce only A₁ gametes. A₁A₂ individuals, provided segregation is normal,
produce equal numbers of A₁ and A₂ gametes. Then, provided all genotypes
are equally fertile, the frequency of A₁ among all the gametes produced by the
whole population is P + ½H, which by equation [1.1] is the gene frequency of A₁
in the parents producing the gametes. Thus the gene frequency in the whole gametic
output is the same as in the parents. This is step 1a in Table 1.1. Only some of
the gametes form zygotes that will become individuals in the next generation. The
gene frequency in the zygotes is unchanged provided the gametes carrying different
alleles do not differ in their fertilizing capacity, and provided the zygotes formed
represent a large sample of the parental gametes. This is step 1b.

Table 1.1 Steps of deduction in the proof of the Hardy-Weinberg law, and the
conditions that must hold

Step   Deduction from: to                            Conditions

       Gene frequency in parents
1a       ↓                                           (1) Normal gene segregation
                                                     (2) Equal fertility of parents
       Gene frequency in all gametes
1b       ↓                                           (3) Equal fertilizing capacity of gametes
                                                     (4) Large population
       Gene frequency in gametes forming zygotes
2        ↓                                           (5) Random mating
                                                     (6) Equal gene frequencies in male and female parents
       Genotype frequencies in zygotes
3        ↓                                           (7) Equal viability
       Genotype frequencies in progeny
4        ↓
       Gene frequency in progeny

2. From gene frequency in gametes to genotype frequencies in zygotes. Random
mating between individuals is equivalent to random union among their gametes. The
genotype frequencies among the zygotes (fertilized eggs) are then the products of
the frequencies of the gametic types that unite to produce them. The genotype frequencies
among the progeny produced by random mating can therefore be determined
simply by multiplying the frequencies of the gametic types produced by each
sex of parents. Provided the gametic frequencies are the same in each sex, the zygotes
produced are as shown in Table 1.2.

Table 1.2

                                   Female gametes and
                                   their frequencies
                                     A₁        A₂
                                     p         q
Male gametes        A₁   p         A₁A₁      A₁A₂
and their                           p²        pq
frequencies         A₂   q         A₁A₂      A₂A₂
                                    pq        q²

The union of A₁ eggs with A₂ sperms need not be distinguished from that of
A₂ eggs with A₁ sperms; so the genotype frequencies of the zygotes are:

Genotype      A₁A₁    A₁A₂    A₂A₂
Frequency      p²     2pq      q²

3. From zygotes to adults. The genotype frequencies in the zygotes deduced above
are the Hardy-Weinberg frequencies, as stated in [1.2]. This is, however, not quite
the end of the proof because the frequencies will not be observable unless the zygotes
survive equally well, at least until they can be classified for genotype. This step
may seem trivial, but it must be recognized if the effects of differential viability
are to be understood.

4. From genotype frequencies to gene frequency in progeny. This final step proves
that gene frequency has not changed. Provided the different genotypes in the progeny
survive equally well to adulthood when they can become parents, their
frequencies will be as above. The gene frequency in the adult progeny can then be
found by equation [1.1]. The frequency of A₁ is p² + ½(2pq) = p(p + q) = p, which
is the same as in the parent generation. This proves the constancy of the gene
frequency from one generation to the next.
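
The four steps can be checked numerically. The following is a minimal sketch, assuming all the conditions of the law hold; the function name `random_mating` and the starting genotype frequencies are arbitrary choices made for illustration.

```python
def random_mating(P, H, Q):
    """One generation of random mating under the Hardy-Weinberg
    conditions. Takes the parental genotype frequencies of A1A1,
    A1A2 and A2A2 and returns the progeny genotype frequencies."""
    p = P + 0.5 * H                  # step 1: gene frequency in gametes, eq. [1.1]
    q = 1.0 - p
    return p * p, 2 * p * q, q * q   # step 2: random union of gametes, eq. [1.2]


parents = (0.5, 0.2, 0.3)   # arbitrary, not in Hardy-Weinberg proportions
progeny = random_mating(*parents)

p_parents = parents[0] + 0.5 * parents[1]
p_progeny = progeny[0] + 0.5 * progeny[1]   # step 4: gene frequency in progeny

print([round(x, 4) for x in progeny])            # [0.36, 0.48, 0.16]
print(round(p_parents, 4), round(p_progeny, 4))  # 0.6 0.6
```

Applying `random_mating` again to the progeny returns the same genotype frequencies, which is the equilibrium property.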

Two further aspects of the Hardy-Weinberg law can now be stated. First, since
the gene frequencies are the same in parents and progeny, the relationship between
gene frequencies and genotype frequencies in [1.2] applies to a single generation.
Second, the genotype frequencies in the progeny depend only on the gene frequencies
in the parents and not on the genotype frequencies. This can be seen from step
1 above, where the frequencies of the gametic types were shown to be equal to the
parental gene frequencies, no matter what the genotype frequencies are. Consequently,
parents with any genotype frequencies, provided they mate at random and
provided the gene frequency is the same in males and females, produce progeny
in Hardy-Weinberg proportions. If the gene frequency is known not to be the same
in males and females, it is easy to deduce the genotype frequencies in the progeny
by putting the appropriate gametic frequencies in Table 1.2 at step 2. The gene
frequency of an autosomal gene becomes equal in the two sexes of the progeny, and
a second generation of random mating produces Hardy-Weinberg genotype frequencies
corresponding to the mean of the original gene frequencies.
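
The case of unequal gene frequencies in the two sexes can be sketched in the same way. The function name `progeny_genotypes` and the starting frequencies 0.8 and 0.4 are hypothetical values chosen for illustration; the calculation simply fills in Table 1.2 with different gametic frequencies for sperm and eggs.

```python
def progeny_genotypes(p_male, p_female):
    """Genotype frequencies (A1A1, A1A2, A2A2) among progeny when the
    frequency of A1 is p_male in sperm and p_female in eggs."""
    q_male, q_female = 1 - p_male, 1 - p_female
    return (p_male * p_female,
            p_male * q_female + q_male * p_female,
            q_male * q_female)


# Hypothetical frequencies of A1 in male and female parents.
first = progeny_genotypes(0.8, 0.4)
p_mean = first[0] + 0.5 * first[1]     # gene frequency in progeny of either sex
second = progeny_genotypes(p_mean, p_mean)

print([round(x, 4) for x in first])    # [0.32, 0.56, 0.12]
print(round(p_mean, 4))                # 0.6
print([round(x, 4) for x in second])   # [0.36, 0.48, 0.16]
```

The first generation has an excess of heterozygotes (0.56 against the Hardy-Weinberg value of 0.48), and the second generation reaches Hardy-Weinberg proportions at the mean gene frequency of 0.6.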

The relationship between gene frequencies and genotype frequencies in a population
in Hardy-Weinberg equilibrium is shown in Fig. 1.1. The graphs of genotype
frequencies show two important features. First, the frequency of the heterozygotes
cannot be greater than 50 per cent, and this maximum occurs when the gene frequencies
are p = q = 0.5. Second, when the gene frequency of an allele is low,
the rare allele occurs predominantly in heterozygotes and there are very few
homozygotes. This has important consequences for the effectiveness of selection,
as will be seen in the next chapter.
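
Both features follow from the Hardy-Weinberg frequencies in [1.2] by simple algebra. The short working below writes H for the heterozygote frequency 2pq, with p = 1 - q; the layout is an illustrative sketch rather than a derivation taken from the text.

```latex
\begin{align*}
H &= 2q(1-q) = \tfrac{1}{2} - 2\left(q - \tfrac{1}{2}\right)^{2} \le \tfrac{1}{2},
  \quad \text{with equality only when } q = \tfrac{1}{2}; \\[4pt]
\frac{\text{A}_2 \text{ genes in heterozygotes}}{\text{all A}_2 \text{ genes}}
  &= \frac{\tfrac{1}{2}(2pq)}{q^{2} + \tfrac{1}{2}(2pq)}
   = \frac{pq}{q(q+p)} = p = 1 - q .
\end{align*}
```

For example, when q = 0.01, 99 per cent of the A₂ genes are carried by heterozygotes.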

Applications of the Hardy-Weinberg law

There are three ways in which the Hardy-Weinberg law is particularly useful, which
will now be illustrated.



Fig. 1.1. Relationship between genotype frequencies and gene frequency for two alleles in a
population in Hardy-Weinberg equilibrium. (Horizontal axis: gene frequency of A₂.)

Gene frequency of recessive allele. At the beginning of the chapter we saw, in equation
[1.1], how the gene frequencies among a group of individuals can be determined
from their genotype frequencies; but for this it was necessary to know the
frequencies of all three genotypes. Consequently, the relationship in equation [1.1]
cannot be applied to the case of a recessive allele, when the heterozygote is
indistinguishable from the dominant homozygote. If the genotypes are in Hardy-Weinberg
proportions, however, we do not need to know the frequencies of all three
genotypes. Let a, for example, be a recessive gene with a frequency of q; then the
frequency of aa homozygotes is q², and the gene frequency is the square root of
the homozygote frequency. Example 1.3 illustrates the calculation. For this way
of estimating the gene frequency to be a valid one, it is obviously essential that there
should be no selective elimination of homozygotes before they are counted. It should
be noted also that the estimation of gene frequency in this way is rather sensitive
to the effects of non-random mating.

Frequency of ‘carriers’. It is often of interest to know the frequency of
heterozygotes, or ‘carriers’, of recessive abnormalities, and this can be calculated
if the gene frequency is known. If Hardy-Weinberg equilibrium can be assumed,
the frequency of heterozygotes among all individuals, including homozygotes, is
given by 2q(1 - q). It is, however, often more relevant to know the frequency
among normal individuals, though this will not be very different if homozygotes
are rare. The frequency of heterozygotes among normal individuals, denoted by H',
is the ratio of genotype frequencies Aa/(AA + Aa), where a is the recessive allele.
So, when q is the frequency of a,

\[
H' = \frac{2q(1-q)}{(1-q)^{2} + 2q(1-q)} = \frac{2q}{1+q} \tag*{[1.3]}
\]

Example 1.3 Phenylketonuria (PKU) is a human metabolic disease due to a single recessive
gene. Homozygotes can be detected a few days after birth, and selective elimination before
then will be assumed to be negligible. Tests of babies born in Birmingham, UK, over
a 3-year period detected 5 cases in 55,715 babies (Raine et al., 1972). The frequency
of homozygotes in the sample is 90 × 10⁻⁶ or about 1/11,000. The Hardy-Weinberg
frequency of homozygotes is q², so the gene frequency is q = √(90 × 10⁻⁶) = 9.5
× 10⁻³ = 0.0095.

The frequency of heterozygotes in the whole population is 2q(1 - q), and among
normal individuals is 2q/(1 + q). Both work out to be 0.019, approximately. Thus about
2 per cent of normal people, or 1 in 50, are carriers of PKU. It comes as a surprise
to most people to discover how common heterozygotes of a rare recessive abnormality
are. The point has already been noted as a conclusion drawn from Fig. 1.1.
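
The arithmetic of Example 1.3 can be reproduced as follows. This is a sketch under the same assumptions as the example (Hardy-Weinberg proportions and no selective elimination before testing); the variable names are illustrative.

```python
import math

cases, births = 5, 55715            # Raine et al. (1972), as quoted in Example 1.3
freq_homozygotes = cases / births   # observed frequency of aa, taken as q squared
q = math.sqrt(freq_homozygotes)     # gene frequency of the recessive allele

carriers_all = 2 * q * (1 - q)      # heterozygotes among all individuals
carriers_normal = 2 * q / (1 + q)   # heterozygotes among normal individuals, eq. [1.3]

print(f"q = {q:.4f}")
print(f"carriers among all individuals    = {carriers_all:.4f}")
print(f"carriers among normal individuals = {carriers_normal:.4f}")
```

The two carrier frequencies agree to about 0.019 because homozygotes are so rare that excluding them from the denominator makes almost no difference.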

Test of Hardy—Weinberg equilibrium. If data are available for a locus where all 
the genotypes are recognizable, the observed frequencies of the genotypes can be 
tested for agreement with a population in Hardy-Weinberg equilibrium. According 
to the Hardy-Weinberg law, the genotype frequencies of progeny are determined 
by the gene frequency in their parents. If the population is in equilibrium, the gene 
frequency is the same in parents and progeny, so the gene frequency observed in 
the progeny can be used as if it were the parental gene frequency to calculate the 
genotype frequencies expected by the Hardy-Weinberg law. The procedure is 
illustrated in Example 1.4. 

Example 1.4 The M-N blood group frequencies in Iceland were given in Example 1.1. 
The observed numbers in the sample were as in the following table. The gene frequencies 
in the sample are first calculated from the observed numbers by equation [1.1]. Then 
the Hardy-Weinberg genotype frequencies, p^2, 2pq and q^2, are calculated from the gene 
frequencies by equation [1.2], and each is multiplied by the total number to get the numbers 
expected. For example, the expectation for MM is (0.5696)^2 × 747. Comparing the 
observed with expected numbers shows a deficiency of both homozygotes and an excess 
of heterozygotes. The χ^2 tests how well, or how badly, the observed numbers agree 
with the expected. The discrepancy is not significant and could easily have arisen by 
chance in the sampling. Note that this χ^2 has only 1 degree of freedom because the gene 
frequency has been estimated from the data, so that the observed and expected numbers 
must agree in their gene frequencies as well as in their totals. 



                     Genotypes                              Gene frequencies
                     MM        MN        NN        Total    M         N
Numbers observed     233       385       129       747      0.5696    0.4304
Numbers expected     242.36    366.26    138.38    747

χ^2 (1 d.f.) = 1.96        P ≈ 0.2
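
The procedure of Example 1.4 can be sketched in Python as follows. The function name `hardy_weinberg_test` is hypothetical, and the probability is obtained from the standard identity for a χ^2 with 1 degree of freedom, P = erfc(√(χ^2/2)), which is not part of the original text. Because the sketch uses the unrounded gene frequency, the expected numbers may differ from the table in the second decimal place.

```python
import math

def hardy_weinberg_test(n_mm, n_mn, n_nn):
    """Chi-square test of observed genotype numbers against Hardy-Weinberg expectations."""
    total = n_mm + n_mn + n_nn
    p = (2 * n_mm + n_mn) / (2 * total)      # gene frequency of M by counting, equation [1.1]
    q = 1 - p
    expected = (p * p * total, 2 * p * q * total, q * q * total)
    observed = (n_mm, n_mn, n_nn)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of chi-square with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return p, expected, chi2, p_value

# Icelandic M-N data from Example 1.4
p, expected, chi2, p_value = hardy_weinberg_test(233, 385, 129)
print(f"p = {p:.4f}, chi2 = {chi2:.2f}, P = {p_value:.2f}")
```

Only one degree of freedom is used because both the total and the gene frequency are fixed by the data, as the example explains.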







The test for agreement with an equilibrium population is a test of whether the 
conditions for the production of Hardy-Weinberg genotype frequencies have been 
fulfilled. The conclusions that can be drawn from the test, however, are limited. 
When good agreement is found, the test gives no reason to doubt the fulfilment of 
all the conditions. Tests made with blood-group genes nearly always show very good 
agreement, as in Example 1.4. But there is one condition whose non-fulfilment will 
not lead to a discrepancy, and that is equal fertility among the parents. The reason 
for this will be explained in a moment. If the test reveals a discrepancy between 
the observed and expected frequencies, we can conclude that one or more of the 
conditions has not been fulfilled. But the nature of the discrepancy does not allow 
us to identify its source, or decide which condition has not been met. The reason 
for this is that the same discrepancy can arise from different causes. For example, 
an excess of heterozygotes can result from selective elimination of homozygotes, 
or from the gene frequency being different in males and females of the parental 
generation. The test is not as simple as it seems, and we must look more closely 
at what it does. 

The Hardy-Weinberg law relates genes in parents to genotypes in progeny. 
Therefore, to test it fully, we need to know the gene frequency in the parents and 
to calculate the expected genotype frequencies in the progeny from the parental gene 
frequency. But for the test described we have only the progeny. We find the gene 
frequency in them by counting. We then say: if this was the gene frequency among 
the gametes that produced these progeny, the genotypes should be in the Hardy-Weinberg 
proportions as calculated from the observed gene frequency. If the gene 
frequency was not the same in the parents as in the progeny, we have used the wrong 
gene frequency to calculate the expectations. Reference to Table 1.1 will show that 
the conditions tested are random mating, equal gene frequencies in the two sexes 
of parents, and equal viability among the progeny; but equal fertility among the 
parents is not tested. Selection could therefore be acting through fertility and not 
be detected by this test. Selection acting through the viability of the progeny will 
lead to disagreement between the observed and expected frequencies. It is not possible, 
however, to identify the genotype or genotypes that have reduced viability. The reason 
for this will be explained in the next chapter, after the effects of selection have been 
dealt with. For fuller discussions of the limitations of the test see Wallace (1958, 
1968), Prout (1965); and for fuller consideration of its statistical aspects see Smith 
(1970). 

Mating frequencies and another proof of the Hardy-Weinberg law 
Let us now look more closely into the breeding structure of a random-mating population, 
distinguishing the types of mating according to the genotypes of the pairs, and 
seeing what are the genotype frequencies among the progenies of the different types 
of mating. This provides a general method for relating genotype frequencies in successive 
generations, which will be used in a later chapter. It also provides another 
proof of the Hardy-Weinberg law; a proof more cumbersome than that already given 
but showing more clearly how the Hardy-Weinberg frequencies arise from the 
Mendelian laws of segregation. The procedure is to obtain first the frequencies of 
all possible mating types according to the frequencies of the genotypes among the 
parents, and then to obtain the frequencies of genotypes among the progeny of each 
type of mating according to the Mendelian ratios. 

Consider a locus with two alleles, and let the frequencies of genes and genotypes 
in the parents be, as before: 

                 Genes            Genotypes
                 A1      A2       A1A1    A1A2    A2A2
Frequencies      p       q        P       H       Q


There are altogether nine types of mating, and their frequencies when mating is random 
are found by multiplying together the marginal frequencies as shown in Table 
1.3. Since the sex of the parent is irrelevant in this context, some of the types of 
mating are equivalent, and the number of different types reduces to six. By summation 
of the frequencies of equivalent types, we obtain the frequencies of mating types 
in the first two columns of Table 1.4. Now we have to consider the genotypes of 
offspring produced by each type of mating, and find the frequency of each genotype 
in the total progeny, assuming, of course, that all types of mating are equally fertile 
and all genotypes equally viable. This is done in the right-hand side of Table 1.4. 
Thus, for example, matings of the type A1A1 × A1A1 produce only A1A1 offspring. 
So, of the total progeny, a proportion P^2 are A1A1 genotypes derived from this type 
of mating. Similarly, one-quarter of the offspring of A1A2 × A1A2 matings are 
A1A1. So this type of mating, which has a frequency of H^2, contributes A1A1 
progeny amounting to a proportion (1/4)H^2 of the total progeny. To find the frequency 
of each genotype in the total progeny we add the frequencies contributed by each type 
of mating. The sums, 
after simplification, are given at the foot of Table 1.4, and from the identity given 
Table 1.3 

                                       Genotype and frequency of female parent
                                       A1A1        A1A2        A2A2
                                       P           H           Q
Genotype and        A1A1     P         P^2         PH          PQ
frequency of        A1A2     H         PH          H^2         HQ
male parent         A2A2     Q         PQ          HQ          Q^2


Table 1.4 

Mating                           Genotype and frequency of progeny
Type              Frequency      A1A1               A1A2                         A2A2
A1A1 × A1A1       P^2            P^2                -                            -
A1A1 × A1A2       2PH            PH                 PH                           -
A1A1 × A2A2       2PQ            -                  2PQ                          -
A1A2 × A1A2       H^2            (1/4)H^2           (1/2)H^2                     (1/4)H^2
A1A2 × A2A2       2HQ            -                  HQ                           HQ
A2A2 × A2A2       Q^2            -                  -                            Q^2

Sums                             (P + (1/2)H)^2     2(P + (1/2)H)(Q + (1/2)H)    (Q + (1/2)H)^2
                               = p^2                2pq                          q^2


in equation [1.1] they are seen to be equal to p^2, 2pq, and q^2. These are the 
Hardy-Weinberg equilibrium frequencies, and we have shown that they are attained 
by one generation of random mating, irrespective of the genotype frequencies among 
the parents. 
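
The bookkeeping of Tables 1.3 and 1.4 can be reproduced numerically. The following Python sketch is an illustration only: the names `OFFSPRING` and `progeny_frequencies` are introduced here, and the parental frequencies in the usage lines are arbitrary values chosen to be far from Hardy-Weinberg proportions.

```python
from itertools import product

# Mendelian offspring ratios (A1A1, A1A2, A2A2) for the six distinct types of mating
OFFSPRING = {
    ("A1A1", "A1A1"): (1.0, 0.0, 0.0),
    ("A1A1", "A1A2"): (0.5, 0.5, 0.0),
    ("A1A1", "A2A2"): (0.0, 1.0, 0.0),
    ("A1A2", "A1A2"): (0.25, 0.5, 0.25),
    ("A1A2", "A2A2"): (0.0, 0.5, 0.5),
    ("A2A2", "A2A2"): (0.0, 0.0, 1.0),
}

def progeny_frequencies(P, H, Q):
    """Genotype frequencies among progeny after one generation of random mating."""
    parents = {"A1A1": P, "A1A2": H, "A2A2": Q}
    progeny = [0.0, 0.0, 0.0]
    # All nine female x male combinations, with frequencies as in Table 1.3
    for (female, f_freq), (male, m_freq) in product(parents.items(), repeat=2):
        ratios = OFFSPRING[tuple(sorted((female, male)))]  # sex of parent is irrelevant
        for i, ratio in enumerate(ratios):
            progeny[i] += f_freq * m_freq * ratio
    return tuple(progeny)

# Arbitrary parental genotype frequencies, not in Hardy-Weinberg proportions
P, H, Q = 0.5, 0.1, 0.4
p = P + H / 2
q = Q + H / 2
result = progeny_frequencies(P, H, Q)
print(result, (p * p, 2 * p * q, q * q))
```

Whatever values of P, H, and Q are supplied (summing to 1), the two printed triples should agree, which is the content of the proof.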

Multiple alleles 

Restriction of the treatment to two alleles at a locus suffices for many purposes. 
If we are interested in one particular allele, as often happens, then all the other alleles 
at the locus can be treated as one. Formulation of the situation in terms of two alleles 
is therefore often possible even if there are in fact more than two. If we are interested 
in more than one allele we can still, if we like, treat the situation as a two-allele 
system by considering each allele in turn and lumping the others together. But the 
treatment can be easily extended to cover more than two alleles, and no new principle 
is introduced. In general, if q_1 and q_2 are the frequencies of any two alleles, 
A1 and A2, of a multiple series, then the genotype frequencies under Hardy-Weinberg 
equilibrium are as follows: 
Genotype:     A1A1       A1A2         A2A2
Frequency:    q_1^2      2q_1q_2      q_2^2


These frequencies are also attained by one generation of random mating. This can 
readily be seen by reducing the situation to a two-allele system, and considering 
each allele in turn. Or it can be proved, though somewhat more laboriously, by the 
method explained above for the two-allele system. 

Example 1.5 The ABO blood groups in man are determined by a series of allelic genes. 
For the purpose of illustration we shall recognize three alleles, A, B, and O, and show 
how the gene frequencies can be estimated from the blood-group frequencies. Since O 
is recessive to both A and B, Hardy-Weinberg frequencies have to be assumed for 
estimating the gene frequencies. Let the frequencies of the A, B, and O genes be p, 
q, and r respectively, so that p + q + r = 1. The following table shows: (1) the blood 
groups (i.e. phenotypes); (2) the genotypes represented in each group; (3) the expected 
frequencies of the blood groups in terms of p, q, and r, on the assumption of Hardy-Weinberg 
equilibrium; (4) observed frequencies of blood groups in a sample of 190,177 
U.K. airmen, quoted by Race and Sanger (1954). 


                                     Frequency
(1) Blood group    (2) Genotype      (3) expected     (4) observed %
A                  AA + AO           p^2 + 2pr        41.716
B                  BB + BO           q^2 + 2qr        8.560
O                  OO                r^2              46.684
AB                 AB                2pq              3.040


Calculation of the gene frequencies is rather more complicated than in the case of two 
alleles. The following is the simplest method: other methods, giving maximum-likelihood 
estimates, are described by Yasuda and Kimura (1968) and Elandt-Johnson (1971). First, 
the frequency of the O gene is simply the square root of the frequency of the O group. 
Next it will be seen that the sum of the frequencies of the B and O groups is q^2 + 2qr 
+ r^2 = (q + r)^2 = (1 - p)^2. So p = 1 - √(B + O), where B and O are the frequencies 
of the blood groups B and O. In the same way q = 1 - √(A + O), and we 
have seen that r = √O. This method gives the following gene frequencies in the sample: 


A gene:    p = 0.2567
B gene:    q = 0.0598
O gene:    r = 0.6833
Total          0.9998


The reason why the estimates of the gene frequencies do not add up exactly to unity 
is that the genotypes in the sample are not the exact Hardy-Weinberg proportions. The 
AB group has not been used in arriving at the estimates, so some information has been 
lost. If the unused phenotype had been at a higher frequency, the loss of information 
might have been more serious and a more exact method might have been needed. 
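
The square-root method of Example 1.5 can be written out as a short Python sketch. The function name `abo_gene_frequencies` is introduced here for illustration, and the observed percentages are entered as proportions. It implements only the simple method described above, not the maximum-likelihood methods cited.

```python
import math

def abo_gene_frequencies(freq_a, freq_b, freq_o):
    """Simple square-root estimates of the A, B, and O gene frequencies.

    Arguments are the observed frequencies of blood groups A, B, and O;
    the AB group is not used, as noted in the text.
    """
    p = 1 - math.sqrt(freq_b + freq_o)   # B + O = (q + r)^2 = (1 - p)^2
    q = 1 - math.sqrt(freq_a + freq_o)   # A + O = (p + r)^2 = (1 - q)^2
    r = math.sqrt(freq_o)                # O = r^2
    return p, q, r

# Observed frequencies from the table in Example 1.5
p, q, r = abo_gene_frequencies(0.41716, 0.08560, 0.46684)
print(f"p = {p:.4f}, q = {q:.4f}, r = {r:.4f}, total = {p + q + r:.4f}")
```

The shortfall of the total from 1 gives a rough indication of how far the sample departs from exact Hardy-Weinberg proportions.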
Sex-linked genes 

With sex-linked genes the situation is rather more complex than with autosomal genes. 
The relationship between gene frequency and genotype frequency in the homogametic 
sex is the same as with an autosomal gene, but the heterogametic sex has only two 
genotypes and each individual carries only one gene instead of two. For this reason 
two-thirds of the sex-linked genes in the population are carried by the homogametic 
sex and one-third by the heterogametic. For the sake of brevity the heterogametic 
sex will be referred to as male. Consider two alleles, A1 and A2, with frequencies 
p and q, and let the genotypic frequencies be as follows: 


              Females                       Males
              A1A1     A1A2     A2A2        A1      A2
Frequency:    P        H        Q           R       S


The frequency of A1 among the females is then p_f = P + (1/2)H, and the frequency 
among the males is p_m = R. The frequency of A1 in the whole population is 

$$
\begin{aligned}
p &= \tfrac{2}{3}\,p_f + \tfrac{1}{3}\,p_m \\
  &= \tfrac{1}{3}(2p_f + p_m) \\
  &= \tfrac{1}{3}(2P + H + R)
\end{aligned}
$$


Now, if the gene frequencies among males and among females are different, the 
population is not in equilibrium. The gene frequency in the population as a whole 
does not change, but its distribution between the two sexes oscillates as the population 
approaches equilibrium. The reason for this can be seen from the following 
consideration. Males get their sex-linked genes only from their mothers; therefore 
p_m is equal to p_f in the previous generation. Females get their sex-linked genes 
equally from both parents; therefore p_f is equal to the mean of p_m and p_f in the 
previous generation. Using primes to indicate the progeny generation, we have 

$$p'_m = p_f$$

$$p'_f = \tfrac{1}{2}(p_m + p_f)$$

The difference between the frequencies in the two sexes is 

$$p'_f - p'_m = \tfrac{1}{2}(p_m + p_f) - p_f = -\tfrac{1}{2}(p_f - p_m)$$

i.e. half the difference in the previous generation, but in the other direction. 
Therefore the distribution of the genes between the two sexes oscillates, but the difference 
is halved in successive generations and the population rapidly approaches 
an equilibrium in which the frequencies in the two sexes are equal. Figure 1.2 
illustrates the approach to equilibrium with a gene frequency of 2/3, when the population 
is started by mixing females of one sort (all A1A1) with males of another sort 
(all A2) and letting them breed at random. 
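
The oscillating approach to equilibrium can be traced with a few lines of Python. This is a sketch under the assumptions of the text (random mating, no selection); the function name `sex_linked_approach` is introduced here for illustration, and the starting values correspond to the population just described.

```python
def sex_linked_approach(p_f, p_m, generations):
    """Gene frequencies in females and males over successive generations."""
    history = [(p_f, p_m)]
    for _ in range(generations):
        # Females take the mean of both parents; males take their mothers' frequency
        p_f, p_m = (p_f + p_m) / 2, p_f
        history.append((p_f, p_m))
    return history

# Females all A1A1 (p_f = 1), males all A2 (p_m = 0)
history = sex_linked_approach(1.0, 0.0, 6)
for generation, (p_f, p_m) in enumerate(history):
    overall = (2 * p_f + p_m) / 3
    print(generation, round(p_f, 4), round(p_m, 4), round(overall, 4), round(p_f - p_m, 4))
```

The last two columns should show the overall frequency staying at 2/3 while the difference between the sexes is halved and reversed in sign each generation.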



Example 1.6 Searle (1949) gives the frequencies of a number of genes in a sample of 
cats in London. The animals examined were sent to clinics for destruction; they were 
therefore not necessarily a random sample. Among the genes studied was the sex-linked 
Fig. 1.2. Approach to equilibrium under random mating for a sex-linked gene, showing the 
gene frequency among females, among males, and in the two sexes combined. The population 
starts with females all of one sort (q_f = 1), and males all of the other sort (q_m = 0). 

gene formerly known as ‘yellow’ but now called ‘orange’ (O). All three genotypes in 
females are recognizable, the heterozygote being ‘tortoiseshell’ or ‘calico’. The data were 
tested against the Hardy-Weinberg expectations, to see particularly if there was any 
evidence of non-random mating. The first test is to see whether the gene frequency is 
the same in the two sexes. Then the genotypes in females are tested against the Hardy-Weinberg 
law in the same way as was done in Example 1.4. The numbers in each 
phenotypic class are shown in the table, with the gene frequencies calculated from them. 
The gene frequency is a little higher in males, but not significantly so. There is therefore 
no reason so far to think the population was not in equilibrium. The appropriate gene 
frequency for calculating the expected genotype frequencies in females is taken, for 


Numbers of individuals 

              Females                                  Males
              ++        +O       OO      Total         +       O      Total
Observed      277       54       7       338           311     42     353
Expected      273.4     61.2     3.4     338
                        P = 0.04

Numbers of genes                          Frequency of O gene
              +        O       Total      q
in females    608      68      676        0.101
in males      311      42      353        0.119
simplicity, to be the gene frequency observed in females. The expectations, calculated 
in the same way as in Example 1.4 are given in the table. The numbers observed do 
not agree very well with expectation, but with the small expected numbers the discrepancy 
is only doubtfully significant. The discrepancy, if real, might possibly have been due 
to non-random mating, but it might also have been due to human preferences for the 
colours having biased the sample and made it unrepresentative of the breeding population. 
For a more extensive analysis and discussion of cat populations, see Metcalfe and 
Turner (1971). 


More than one locus 

The attainment of the equilibrium in genotype frequencies after one generation of 
random mating is true of all autosomal loci considered separately. But it is not true 
of the genotypes with respect to two or more loci considered jointly. To illustrate 
the point, suppose there were two populations, one consisting entirely of A1A1B1B1 
genotypes and the other entirely of A2A2B2B2 genotypes. Suppose that these two 
populations were mixed, with equal numbers of each sex, and allowed to mate at 
random. With two alleles at each of two loci there are nine possible genotypes, but 
only three of these would appear in the first-generation progeny, the two original 
double homozygotes and the double heterozygote. There would be complete association 
between the traits determined by the two loci, and the two traits would appear 
to be determined by a single gene difference. With continued random mating the 
missing genotypes would appear in subsequent generations, but not immediately at 
their equilibrium frequencies, and the initial association between the traits would 
be progressively reduced. If the two loci were linked, the attainment of equilibrium 
frequencies would take longer because the appearance of the missing genotypes 
depends on recombination between the two loci. Disequilibrium with respect to two 
or more loci is called gametic phase disequilibrium, or linkage disequilibrium, 
irrespective of whether the loci are linked or not. Disequilibrium can arise from 
intermixture of populations with different gene frequencies, or from chance in small 
populations. Disequilibrium can also be produced, and maintained, by selection 
favouring one combination of alleles over another. The rate at which a random 
breeding population approaches equilibrium can be deduced as follows. 

We first need a measure of the amount of disequilibrium. This is best expressed 
in terms of the frequencies of gametic types, rather than of zygotic genotypes. Consider 
two loci, each with two alleles, and gene frequencies as shown in Table 1.5. 


Table 1.5 

Genes                            A1           A2           B1           B2
Gene frequencies                 p_A          q_A          p_B          q_B

Gametic types                    A1B1         A1B2         A2B1         A2B2
Frequencies, equilibrium         p_A p_B      p_A q_B      q_A p_B      q_A q_B
Frequencies, actual              r            s            t            u
Difference from equilibrium      +D           -D           -D           +D
There are then four types of gamete. The population is in equilibrium if the gametes 
contain random combinations of the genes. The gametic frequencies at equilibrium 
therefore depend only on the gene frequencies, and are as shown in the table. Let 
the actual, non-equilibrium, frequencies be r, s, t, and u, as shown. Each of these 
differs from the equilibrium frequency by an amount D, two gametic types having 
a positive, and two a negative, deviation. The value of D for each gametic type is 
necessarily the same, except for the sign. The amount of disequilibrium is measured 
by D. The disequilibrium can be expressed by reference to genotypes by comparing 
the frequencies of coupling and of repulsion double heterozygotes. The genotype 
A1B1/A2B2 can be called a coupling heterozygote, whether the two loci are linked 
or not. Its frequency is 2ru. The repulsion heterozygote is A1B2/A2B1 and its frequency 
is 2st. If the population is in equilibrium, these two genotypes have equal 
frequencies. The relationship with D is 

$$D = ru - st$$

Thus D is equal to half the difference in frequency between coupling and repulsion 
heterozygotes. 

When a population in linkage disequilibrium mates at random, the amount of disequilibrium 
is progressively reduced with each succeeding generation. The rate at 
which this happens depends on the frequency of gametic types in two successive 
generations. This is perhaps easiest to visualize if the two loci are thought of as 
being linked on the same chromosome. The disequilibrium D in the progeny generation 
can be obtained from the frequency of any of the four gametic types, so let 
us consider only the A1B1 type. This can appear in the progeny gametes in two 
ways. First, it can be produced as a non-recombinant from the genotype A1B1/AxBx, 
the subscript x meaning that either of the two alleles can be present. The frequency 
with which A1B1 is produced in this way is r(1 - c), r being the frequency of A1B1 
in the parental gametes and c the recombination frequency. Or, second, it can be 
produced as a recombinant from the genotype A1Bx/AxB1. The frequency of the 
A1Bx chromosome is p_A and that of the AxB1 chromosome is p_B. So the frequency 
with which A1B1 arises in this way is p_A p_B c. Therefore the frequency of A1B1 in 
the progeny gametes is 

$$r' = r(1 - c) + p_A p_B c$$

and the disequilibrium in the progeny generation is 

$$
\begin{aligned}
D' &= r' - p_A p_B \\
   &= r(1 - c) - p_A p_B (1 - c) \\
   &= (r - p_A p_B)(1 - c) \\
   &= D(1 - c)
\end{aligned}
$$

If we take the process one generation further we get 

$$D'' = D'(1 - c) = D(1 - c)^2$$

Thus, after any number t of generations, the disequilibrium is given by 

$$D_t = D_0(1 - c)^t$$   ... [1.5] 

The loci do not have to be linked to be in disequilibrium. With unlinked loci 
Fig. 1.3. Approach to equilibrium under random mating of two loci, considered jointly. The 
graphs show the amount of disequilibrium, D, relative to the disequilibrium in generation 0. The 
five graphs refer to different degrees of linkage between the two loci, as indicated by the 
recombination frequency shown alongside each graph. The graph marked 0.5 refers to unlinked 
loci. 


c = 1/2 and the amount of disequilibrium is halved by each generation of random 
mating. With linked loci the disequilibrium disappears more slowly. Figure 1.3 shows 
how the disequilibrium is reduced over 12 generations, with different degrees of 
linkage. 
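
The decay of disequilibrium can be followed numerically, as in this Python sketch. The function names are introduced here for illustration. The recurrence for the A1B1 gamete is the one derived above; the corresponding recurrences for the other three gametic types are written by symmetry and are an assumption of the sketch rather than formulas given in the text. The starting population is the equal mixture of A1A1B1B1 and A2A2B2B2 described earlier.

```python
def disequilibrium(r, s, t, u):
    """D = ru - st for gametic frequencies of A1B1, A1B2, A2B1, A2B2."""
    return r * u - s * t

def next_generation(r, s, t, u, c):
    """Gametic frequencies after one generation of random mating."""
    p_a, p_b = r + s, r + t              # gene frequencies of A1 and B1
    q_a, q_b = 1 - p_a, 1 - p_b
    return (
        r * (1 - c) + p_a * p_b * c,     # A1B1, as derived in the text
        s * (1 - c) + p_a * q_b * c,     # A1B2, by symmetry (assumption)
        t * (1 - c) + q_a * p_b * c,     # A2B1, by symmetry (assumption)
        u * (1 - c) + q_a * q_b * c,     # A2B2, by symmetry (assumption)
    )

def decay(gametes, c, generations):
    """Values of D in generation 0 and each of the following generations."""
    values = [disequilibrium(*gametes)]
    for _ in range(generations):
        gametes = next_generation(*gametes, c)
        values.append(disequilibrium(*gametes))
    return values

start = (0.5, 0.0, 0.0, 0.5)             # only A1B1 and A2B2 gametes, so D = 0.25
unlinked = decay(start, 0.5, 3)
linked = decay(start, 0.2, 3)
print(unlinked)
print(linked)
```

Each list should follow equation [1.5], with D multiplied by (1 - c) in every generation.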

The approach to equilibrium given by the above equation applies equally to the 
disequilibrium of any number of loci considered jointly, provided (1 - c) is defined 
as the probability of a gamete passing through a generation without recombination 
between any of the loci. The larger the number of loci the smaller is the probability 
of no recombination; with two unlinked loci it is 1/2, with three 1/4, and with four 1/8. 
Thus the multilocus disequilibrium decays faster than the 2-locus, which soon comes 
to dominate the total disequilibrium among a number of loci. A practical consequence 
of this is that when a number of loci are available for study, disequilibrium is more 
likely to be found with pairs of loci than with larger numbers considered jointly. 
For details of three loci see Crow and Kimura (1970) and for methods of estimation 
see Weir and Cockerham (1979). 

Linkage disequilibrium has consequences which will have to be taken into consideration 
in later chapters, particularly in connection with selection for quantitative 
characters. Knowledge of the disequilibrium in a population can be useful in several 
ways (see Crow, 1986, p. 23): it can tell us something about the breeding history 
of the population, which single loci cannot because they come to Hardy-Weinberg 
frequencies after a single generation of random mating; and disequilibrium of a disease 
gene with a closely linked restriction fragment length polymorphism (RFLP) can 
be useful in genetic counselling. 

Non-random mating 

There are two distinct forms of non-random mating. The first is when mated individuals 
are related to each other by ancestral descent. This tends to increase the 
frequencies of homozygotes at all loci. Its effect will be described in Chapter 3. 
The second is when individuals tend to mate preferentially with respect to their 
genotypes at any particular locus under consideration. This form of non-random 
mating is dealt with briefly here. 

Assortative mating 

If mated pairs are of the same phenotype more often than would occur by chance, 
this is called assortative mating, and if less often, it is called disassortative mating. 
To the extent that the phenotype reflects the genotype, assortative or disassortative 
mating affects the genotype frequencies. The effects are described by Crow and 
Kimura (1970) and will be only briefly outlined here. Assortative mating is of some 
importance in human populations, where it occurs with respect to stature, intelligence, 
and other characters. These, however, are not single gene differences such as can 
be discussed in the present context. Disassortative mating is widespread in the self-sterility 
system of plants. 

The consequences of assortative mating with a single locus can be deduced from 
Table 1.4 by appropriate modification of the frequencies of the types of mating to 
allow for the increased frequency of matings between like phenotypes. The effect 
on the genotype frequencies among the progeny is to increase the frequencies of 
homozygotes and reduce that of heterozygotes. In effect the population becomes partially 
subdivided into two groups, mating taking place more frequently within than 
between the groups. If assortative mating is continued in successive generations, 
the population approaches an equilibrium at which the genotype frequencies remain 
constant. 

Disassortative mating has consequences that are, in general, opposite to those of 
assortative mating: it leads to an increase of heterozygotes and a reduction of 
homozygotes. Disassortative mating, however, usually has the additional consequence 
of changing the gene frequency. If mating is predominantly between unlike 
phenotypes, then the rarer phenotype has a better chance of success in mating than 
has the commoner phenotype. Consequently the rarer alleles are favoured, and the 
gene frequency changes toward intermediate values at which the phenotypes are equal 
in frequency. A familiar example of disassortative mating is the bisexual mode of 
reproduction which leads immediately to a gene, or chromosome, frequency of 0.5. 
Self-sterility mechanisms of plants are based on multiple alleles, and the favouring 
of the rarer alleles results in the coexistence of a large number of alleles, all at more 
or less equal frequencies. 

Problems 

1.1 The following numbers of the human M-N blood groups were recorded in a 
sample of American Whites. 
    M       MN      N

    1787    3039    1303

(1) What are the genotype frequencies observed in this sample? 

(2) What are the gene frequencies? 

(3) With the gene frequencies observed, what are the genotype frequencies expected 
from the Hardy-Weinberg law? 

(4) How well do the observed frequencies agree with the expectation? 

Data from Wiener, A.S. (1943) quoted by Stern, C. (1973) Principles of Human 
Genetics. Freeman, San Francisco. 


[Solution 1] 

1.2 About 30 per cent of people do not recognize the bitter taste of phenylthiocarbamide 
(PTC). Inability to taste it is due to a single autosomal recessive gene. 
What is the frequency of the non-tasting gene, assuming the population to be in 
Hardy-Weinberg equilibrium? 

[Solution 11] 

1.3 Albinism occurs with a frequency of about 1 in 20,000 in European populations. 
Assuming it to be due to a single autosomal recessive gene, and assuming 
the population to be in Hardy-Weinberg equilibrium, what proportion of people are 
carriers? Only an approximate answer is needed. 

[Solution 21] 

1.4 As an exercise in algebra, work out the gene frequency of a recessive mutant 
in a random-breeding population that would result in one-third of normal individuals 
being carriers. 


[Solution 31] 

1.5 Three allelic variants, A, B, and C, of the red cell acid phosphatase enzyme 
were found in a sample of 178 English people. All genotypes were distinguishable 
by electrophoresis, and the frequencies in the sample were 

Genotype AA AB BB AC BC CC 

Frequency (%) 9.6 48.3 34.3 2.8 5.0 0.0 

What are the gene frequencies in the sample? Why were no CC individuals found? 

Data from Spencer, N., et al. (1964) Nature, 201, 299—300. 


[Solution 41] 

1.6 About 7 per cent of men are colour-blind in consequence of a sex-linked recessive 
gene. Assuming Hardy-Weinberg equilibrium, what proportion of women are 
expected to be (1) carriers, and (2) colour-blind? (3) In what proportion of marriages 
are both husband and wife expected to be colour-blind? 


[Solution 51] 
1.7 Sine oculis (so) and cinnabar (cn) are two autosomal recessive genes in 
Drosophila melanogaster. They are very closely linked and can be treated as if they 
were alleles at one locus. The ‘heterozygote’, so/cn, is wild-type and is distinguishable 
from both homozygotes; (so/so has no eyes; cn/cn has white eyes if the stock is 
made homozygous for another eye-colour mutant, brown, bw). In a class experiment 
4 males and 4 females of an so/so stock were put in a vial together with 16 
males and 16 females from a cn/cn stock and allowed to mate. There were 20 such 
vials. The total count of progeny, classified by genotype, was as follows. 

so/so so/cn cn/cn 

135 359 947 

How do these numbers differ from the Hardy-Weinberg expectations? Suggest a 
reason for the discrepancy. 


[Solution 61] 

1.8 Suppose that Drosophila cultures are set up in vials as described in Problem 
1.7, but this time with a gene frequency of 0.5. This is done by putting 10 males 
and 10 females of each stock in each vial. The supply of so/so females ran out and 
only 4 were left for the last vial. So, to preserve the intended gene frequency and 
numbers of parents, this vial was made up as follows: 16 ♂♂ + 4 ♀♀ of so/so with 
4 ♂♂ + 16 ♀♀ of cn/cn. The student who got this vial was a bit surprised by what 
he found. What genotype frequencies would you expect in the progeny? 

[Solution 71] 

1.9 Prove that when there are any number of alleles at a locus the total frequency 
of heterozygotes is greatest when all alleles have the same frequency. What is then 
the total frequency of heterozygotes? 


[Solution 81] 

1.10 Suppose that a strain of genotype AA BB is mixed with another strain of 
genotype aa bb, with equal numbers of the two strains and equal numbers of males 
and females, which mate at random. Call this generation of parents generation 0. 
Subsequent generations also mate at random and there are no differences of fertility 
or viability among the genotypes. What will be the frequency of the genotype AA 
bb in the progeny of generation 2, i.e. after two generations of recombination, if 
the two loci are (1) unlinked, (2) linked with a recombination frequency of 20 per 
cent? 


[Solution 91] 

1.11 How will the solutions of Problem 1.10 be altered if the two strains are crossed 
by taking males of one strain and females of the other? 


[Solution 101] 



2 CHANGES OF GENE FREQUENCY 


We have seen that a large random-mating population is stable with respect to gene 
frequencies and genotype frequencies, in the absence of agencies tending to change 
its genetic properties. We can now proceed to a study of the agencies through which 
changes of gene frequency, and consequently of genotype frequencies, are brought 
about. There are two sorts of process: systematic processes, which tend to change 
the gene frequency in a manner predictable both in amount and in direction; and 
the dispersive process, which arises in small populations from the effects of sampling, 
and is predictable in amount but not in direction. In this chapter we are concerned 
only with the systematic processes, and we shall consider only large random-mating 
populations in order to exclude the dispersive process from the picture. There are 
three systematic processes: migration, mutation, and selection. We shall study these 
separately at first, assuming that only one process is operating at a time, and then 
we shall see how the different processes interact. 

Migration 

The effect of migration is very simply dealt with and need not concern us much
here, though we shall have more to say about it later, in connection with small populations.
Let us suppose that a large population consists of a proportion $m$ of new immigrants
in each generation, the remainder, $1 - m$, being natives. Let the frequency
of a certain gene be $q_m$ among the immigrants and $q_0$ among the natives. Then the
frequency of the gene in the mixed population, $q_1$, will be

$$q_1 = m q_m + (1 - m) q_0 = m(q_m - q_0) + q_0$$ ... [2.1]

The change of gene frequency, $\Delta q$, brought about by one generation of immigration
is the difference between the frequency before immigration and the frequency after
immigration. Therefore

$$\Delta q = q_1 - q_0 = m(q_m - q_0)$$ ... [2.2]

Thus the rate of change of gene frequency in a population subject to immigration
depends, as must be obvious, on the immigration rate and on the difference of gene
frequency between immigrants and natives.
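
Equations [2.1] and [2.2] can be checked numerically with a minimal sketch. The
function name `migration_step` and the numerical values are illustrative assumptions,
not taken from the text.

```python
def migration_step(q0, qm, m):
    """Gene frequency after one generation of immigration, equation [2.1]."""
    return m * qm + (1 - m) * q0


# Illustrative values (assumed): natives 0.2, immigrants 0.7, 10 per cent immigrants.
q0, qm, m = 0.2, 0.7, 0.1
q1 = migration_step(q0, qm, m)
delta_q = q1 - q0                      # equation [2.2]
print(q1, delta_q, m * (qm - q0))
```

The last two printed values should agree, since the change is simply the immigration
rate multiplied by the difference between immigrant and native frequencies.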

Mutation

The effect of mutation on the genetic properties of the population differs according 
to whether we are concerned with a mutational event so rare as to be virtually unique, 
or with a mutational step that recurs repeatedly. The first produces no permanent 
change in a large population, whereas the second does. 

Non-recurrent mutation 

Consider first a mutational event that gives rise to just one representative of the 
mutated gene or chromosome in the whole population. This sort of mutation is of 
very little importance as a cause of change of gene frequency, because the product 
of a unique mutation has only a very small chance of surviving in a large population.
The original mutated gene is present in a heterozygote and its chance of being
lost in the next generation is one-half. If it survives, it may be represented by one 
or more copies, but each copy has only a one-half chance of surviving to the third 
generation. The loss is permanent, so the chance of indefinite survival is very small 
indeed, and is zero in an infinitely large population. Because real populations are 
not infinitely large, unique mutations must be expected very occasionally to survive 
indefinitely and lead to a change of gene frequency. More will be said about this 
later in this chapter and in Chapter 4. 

Recurrent mutation 

It is with the second type of mutation, recurrent mutation, that we are chiefly
concerned as an agent for causing change of gene frequency, and in a large population
the frequency of a mutant gene is never so low that complete loss can occur
from sampling. We have, then, to find out what is the effect of this ‘pressure’ of
mutation on the gene frequency in the population.

Suppose gene $A_1$ mutates to $A_2$ with a frequency $u$ per generation. ($u$ is the
proportion of all $A_1$ genes that mutate to $A_2$ between one generation and the next.) If
the frequency of $A_1$ in one generation is $p_0$, the frequency of newly mutated $A_2$
genes in the next generation is $up_0$. So the new gene frequency of $A_1$ is $p_0 - up_0$,
and the change of gene frequency is $-up_0$. Now consider what happens when the
genes mutate in both directions. Suppose for simplicity that there are only two alleles,
$A_1$ and $A_2$, with initial frequencies $p_0$ and $q_0$. $A_1$ mutates to $A_2$ at a rate $u$ per
generation, and $A_2$ mutates to $A_1$ at a rate $v$. Then after one generation there is a
gain of $A_2$ genes equal to $up_0$ due to mutation in one direction, and a loss equal
to $vq_0$ due to mutation in the other direction. Stated in symbols, we have the
situation:

Mutation rate               $A_1 \underset{v}{\overset{u}{\rightleftharpoons}} A_2$

Initial gene frequencies    $p_0 \qquad q_0$

Then the change of gene frequency in one generation is

$$\Delta q = up_0 - vq_0$$ ... [2.3]

It is easy to see that this situation leads to an equilibrium in gene frequency at which
no further change takes place, because if the frequency of one allele increases fewer
of the other are left to mutate in that direction and more are available to mutate
in the other direction. The point of equilibrium can be found by equating the change
of frequency, $\Delta q$, to zero. Thus at equilibrium

$$pu = qv$$

or

$$\frac{p}{q} = \frac{v}{u}$$

and

$$q = \frac{u}{u + v}$$ ... [2.4]
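
The equilibrium in equation [2.4] and the slowness of the approach to it can be
illustrated with a short sketch. The function name `mutation_step` and the mutation
rates used are illustrative assumptions, not values from the text.

```python
def mutation_step(q, u, v):
    """One generation of two-way mutation: delta q = u*p - v*q (equation [2.3])."""
    p = 1 - q
    return q + u * p - v * q


# Assumed rates: reverse mutation one-tenth as frequent as forward mutation.
u, v = 1e-5, 1e-6
q_eq = u / (u + v)                     # equation [2.4]

# Start with no mutant genes and apply mutation pressure alone.
q = 0.0
for _ in range(100_000):
    q = mutation_step(q, u, v)

print(q_eq)                            # equilibrium frequency of the mutant allele
print(q)                               # still well short of equilibrium
```

Under these assumptions the deviation from equilibrium shrinks by a factor of only
$1 - u - v$ per generation, which is why mutation alone changes gene frequencies so
slowly.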


Two conclusions can be drawn from the effect of mutation on gene frequency. 
Mutation rates are generally very low, about $10^{-5}$ or $10^{-6}$ per generation for
most loci in most organisms. This means that between about 1 in 100,000 and 1 
in 1,000,000 gametes carry a newly mutated allele at any particular locus. With 
normal mutation rates, therefore, mutation alone can produce only very slow changes 
of gene frequency; on an evolutionary time-scale they might be important, but they 
could scarcely be detected by experiment except with microorganisms. The second 
conclusion concerns the equilibrium between mutation in the two directions. Studies 
of reverse mutation (from mutant to wild type) show that it is usually much less 
frequent than forward mutation (from wild type to mutant) (Muller and Oster, 1957;
Schlager and Dickie, 1971). If reverse mutation were one-tenth as frequent as forward
mutation, the equilibrium gene frequency resulting from mutation alone would
be 0.1 of the wild type allele and 0.9 of the mutant; in other words the ‘mutant’ 
would be the common form and the ‘wild type’ the rare form. Since this is not the 
situation found in natural populations, it is clear that the frequencies of such genes 
are not the product of mutation alone. We shall see in the next section that the rarity 
of mutant alleles is attributable to selection. 


Selection 

Hitherto we have supposed that all individuals in the population contribute equally 
to the next generation. Now we must take account of the fact that individuals differ 
in viability and fertility, and that they therefore contribute different numbers of offspring
to the next generation. The contribution of offspring to the next generation
is called the fitness of the individual, or sometimes the adaptive value, or selective 
value. If the differences of fitness are in any way associated with the presence or 
absence of a particular gene in the individual’s genotype, then selection operates 
on that gene. When a gene is subject to selection its frequency in the offspring is 
not the same as in the parents, since parents of different genotypes pass on their 
genes unequally to the next generation. In this way selection causes a change of 
gene frequency, and consequently also of genotype frequency. The change of gene 
frequency resulting from selection is more complicated to describe than that resulting 
from mutation, because the differences of fitness that give rise to the selection are 
an aspect of the phenotype. We therefore have to take account of the degree of 
dominance shown by the genes in question. Dominance, in this connection, means 
dominance with respect to fitness, and this is not necessarily the same as the 
dominance with respect to the main visible effects of the gene. Most mutant genes,
for example, are completely recessive to the wild type in their visible effects, but
this does not necessarily mean that the heterozygote has a fitness equal to that of
the wild-type homozygote. The meaning of the different degrees of dominance with
which we shall deal is illustrated in Fig. 2.1.

Fig. 2.1. Degrees of dominance with respect to fitness.

It is most convenient to think of selection acting against the gene in question, in 
the form of selective elimination of one or other of the genotypes that carry it. This 
may operate either through reduced viability or through reduced fertility in its widest 
sense, including mating ability, or through both. In the life-cycle of individuals, 
selection acts first through viability, then through fertility. We therefore have to 
deduce the change of gene frequency from the zygote stage of one generation to 
the zygote stage of the progeny generation. Gene frequencies cannot be observed 
in zygotes, so there are practical difficulties in deducing the selective forces from 
observed changes of gene frequency, but we shall return to these later. The strength 
of the selection is expressed as the coefficient of selection, $s$, which is the proportionate
reduction in the gametic contribution of a particular genotype compared with
a standard genotype, usually the most favoured. The contribution of the favoured
genotype is taken to be 1, and the contribution of the genotype selected against is
then $1 - s$. This expresses the fitness of one genotype relative to the other. Suppose,
for example, that the coefficient of selection is $s = 0.1$; the fitness is then
0.9, which means that for every 100 zygotes produced by the favoured genotype, 
only 90 are produced by the genotype selected against. Fitness, defined in this way 
as the proportionate contribution of offspring, should strictly speaking be called 
relative fitness, but it will be referred to as fitness throughout what follows. 

The fitness of a genotype with respect to any particular locus is not necessarily 
the same in all individuals. It depends on the environmental circumstances in which 
the individual lives, and also on the genotype with respect to genes at other loci. 
When we assign a certain fitness to a genotype, this refers to the average fitness 
of this genotype in the whole population. Though differences of fitness between
individuals result in selection being applied to many, perhaps to all, loci simultaneously,
we shall limit our attention here to the effects of selection on the genes
at a single locus, supposing that the average fitness of the different genotypes remains 
constant despite the changes resulting from selection applied simultaneously to other 
loci. The conclusions we shall reach apply equally to natural selection occurring 
under natural conditions without the intervention of man, and to artificial selection 
imposed by the breeder or experimenter through his choice of individuals as parents 
and through the number of offspring he chooses to rear from each parent. 


Change of gene frequency under selection 

We have first to derive the basic formulae for the change of gene frequency brought
about by one generation of selection. Then we can consider what they tell us about
the effectiveness of selection. The different conditions of dominance have to be taken
account of, but the method is the same for all, and it will be illustrated by reference
to the case of complete dominance with selection acting against the recessive
homozygote. Table 2.1 shows the genotypes with their Hardy-Weinberg frequencies
before selection. $A_2A_2$ is the recessive homozygote with a coefficient of selection
$s$ acting against it. The next line gives the fitness of each genotype. Multiplying
the initial frequency by the fitness gives the frequency of each genotype after selection.
This is entered as the ‘gametic contribution’ in order to allow for selection
to operate over the whole life-cycle. Note that after selection the total frequency
is no longer unity, because there has been a proportionate loss of $sq^2$ due to the
selection. To find the frequency of $A_2$ gametes produced, and so the frequency
of $A_2$ genes in the progeny, we take the gametic contribution of $A_2A_2$ individuals
plus half that of $A_1A_2$ individuals and divide by the new total, i.e., we apply
equation [1.1]. Thus the new gene frequency is

$$q_1 = \frac{q^2(1 - s) + pq}{1 - sq^2}$$

This can be simplified by substituting $p = 1 - q$. Rearrangement then gives

$$q_1 = \frac{q - sq^2}{1 - sq^2}$$ ... [2.5]


Table 2.1 Selection against a recessive gene

                            Genotypes
                            $A_1A_1$    $A_1A_2$    $A_2A_2$        Total
Initial frequencies         $p^2$       $2pq$       $q^2$           $1$
Coefficient of selection    $0$         $0$         $s$
Fitness                     $1$         $1$         $1 - s$
Gametic contribution        $p^2$       $2pq$       $q^2(1 - s)$    $1 - sq^2$

The change of gene frequency, $\Delta q$, resulting from one generation of selection is

$$\Delta q = q_1 - q$$

Substituting for $q_1$ from equation [2.5], and after some rearrangement, this becomes

$$\Delta q = -\frac{sq^2(1 - q)}{1 - sq^2}$$ ... [2.6]


From this we see that the effect of selection on gene frequency depends not only 
on the intensity of selection $s$, but also on the initial gene frequency. But both relationships
are somewhat complex, and the examination of their significance will be
postponed till after the other situations have been dealt with. 
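
The bookkeeping of Table 2.1 can be followed step by step in a short sketch that
then compares the result with equations [2.5] and [2.6]. The function name
`select_recessive` and the values of $q$ and $s$ are illustrative assumptions.

```python
def select_recessive(q, s):
    """One generation of selection against A2A2 (fitnesses 1, 1, 1 - s)."""
    p = 1 - q
    # Gametic contribution = Hardy-Weinberg frequency x fitness (Table 2.1).
    contrib = {"A1A1": p * p, "A1A2": 2 * p * q, "A2A2": q * q * (1 - s)}
    total = sum(contrib.values())                  # equals 1 - s*q**2
    # A2 gametes: all of A2A2 plus half of A1A2, divided by the new total.
    q1 = (contrib["A2A2"] + 0.5 * contrib["A1A2"]) / total
    return q1, total


q, s = 0.4, 0.2                                    # assumed values
q1, total = select_recessive(q, s)
dq_formula = -s * q**2 * (1 - q) / (1 - s * q**2)  # equation [2.6]
print(q1, q1 - q, dq_formula)
```

The change computed from the genotype table and the change given by the closed
formula should coincide to within rounding error.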

Expressions for the new gene frequency and for the change of gene frequency,
with different conditions of dominance, are given in Table 2.2. The general expression
(5) in the table allows $\Delta q$ to be worked out for any degree of dominance with
respect to fitness. Two expressions (3 and 4) are given for a completely dominant
gene, according to the direction of selection. The first, which was derived above,
is for selection against the recessive homozygote. If, in contrast, selection is against
the dominant phenotype, $\Delta q$ is not quite the same. The difference may best be
appreciated by considering the effects of total elimination, when $s = 1$. The formula
for selection against the dominant phenotype then reduces to $q_1 = 1$, which
expresses the fact that if only the recessive homozygotes breed the gene frequency
goes to 1 immediately. Total elimination of the recessive homozygote, on the other
hand, will leave all of the recessive genes that are present in heterozygotes. The
difference between the effects of selection in opposite directions becomes less marked
as the value of $s$ decreases. All the forms of selection mentioned so far tend in the
end to eliminate one or other allele from the population. Overdominance for fitness,
where heterozygotes are superior in fitness, in contrast, tends to maintain both alleles
in the population. This form of selection will be given more detailed attention later.

Table 2.2 Change of gene frequency by one generation of selection, with different
conditions of dominance for fitness, as specified below. The initial gene frequency of
$A_2$ is $q$.

      Initial frequencies and fitness of genotypes  New gene frequency                                      Change of gene frequency
      $A_1A_1$    $A_1A_2$              $A_2A_2$    $q_1$                                                   $\Delta q = q_1 - q$
      $p^2$       $2pq$                 $q^2$

(1)   $1$         $1 - \frac{1}{2}s$    $1 - s$     $\frac{q - \frac{1}{2}sq - \frac{1}{2}sq^2}{1 - sq}$    $-\frac{\frac{1}{2}sq(1 - q)}{1 - sq}$
(2)   $1$         $1 - hs$              $1 - s$     $\frac{q - hspq - sq^2}{1 - 2hspq - sq^2}$              $-\frac{spq[q + h(p - q)]}{1 - 2hspq - sq^2}$
(3)   $1$         $1$                   $1 - s$     $\frac{q - sq^2}{1 - sq^2}$                             $-\frac{sq^2(1 - q)}{1 - sq^2}$
(4)   $1 - s$     $1 - s$               $1$         $\frac{q - sq + sq^2}{1 - s(1 - q^2)}$                  $+\frac{sq^2(1 - q)}{1 - s(1 - q^2)}$
(5)   $1 - s_1$   $1$                   $1 - s_2$   $\frac{q - s_2q^2}{1 - s_1p^2 - s_2q^2}$                $+\frac{pq(s_1p - s_2q)}{1 - s_1p^2 - s_2q^2}$

(1) No dominance; selection against $A_2$.

(2) Partial dominance of $A_1$; selection against $A_2$.

(3) Complete dominance of $A_1$; selection against $A_2$.

(4) Complete dominance of $A_1$; selection against $A_1$.

(5) Overdominance; selection against $A_1A_1$ and $A_2A_2$. (Applicable also to any degree of
dominance with fitnesses expressed relative to $A_1A_2$.)
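
All five lines of Table 2.2 follow from the same procedure as Table 2.1, so they can
be checked against one general function. The sketch below is a minimal illustration;
the function name `next_q` and every numerical value are assumptions made for the
example.

```python
def next_q(q, w11, w12, w22):
    """Frequency of A2 after one generation, given fitnesses of A1A1, A1A2, A2A2."""
    p = 1 - q
    mean_w = p * p * w11 + 2 * p * q * w12 + q * q * w22   # total after selection
    return (q * q * w22 + p * q * w12) / mean_w            # A2A2 plus half of A1A2


# Illustrative values, not taken from the text.
q, s, h, s1, s2 = 0.3, 0.2, 0.25, 0.1, 0.3
p = 1 - q

# line number -> (q1 from the genotype bookkeeping, delta q from the closed form)
rows = {
    1: (next_q(q, 1, 1 - s / 2, 1 - s), -0.5 * s * q * (1 - q) / (1 - s * q)),
    2: (next_q(q, 1, 1 - h * s, 1 - s),
        -s * p * q * (q + h * (p - q)) / (1 - 2 * h * s * p * q - s * q * q)),
    3: (next_q(q, 1, 1, 1 - s), -s * q * q * (1 - q) / (1 - s * q * q)),
    4: (next_q(q, 1 - s, 1 - s, 1), s * q * q * (1 - q) / (1 - s * (1 - q * q))),
    5: (next_q(q, 1 - s1, 1, 1 - s2),
        p * q * (s1 * p - s2 * q) / (1 - s1 * p * p - s2 * q * q)),
}
for line, (q1, dq) in rows.items():
    print(line, round(q1 - q, 6), round(dq, 6))

# Total elimination of the dominant phenotype (s = 1 in line 4) fixes A2 at once.
print(next_q(q, 0, 0, 1))
```

For each line the two printed numbers should agree, which is a useful check when
transcribing the formulae.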

The expressions for $\Delta q$ in Table 2.2 are rather cumbersome and it is often useful
to simplify them by an approximation that is good enough for many purposes. If
either the coefficient of selection $s$, or the gene frequency $q$, is small, then the
denominators of the equations in Table 2.2 become very nearly unity, and we can
use the numerators alone as expressions for $\Delta q$. Then for selection in either direction
we have, with no dominance:

$$\Delta q = \pm \tfrac{1}{2} sq(1 - q)$$ (approx.) ... [2.7]

and with complete dominance:

$$\Delta q = \pm sq^2(1 - q)$$ (approx.) ... [2.8]

Effectiveness of selection

We see from the formulae that the effectiveness of selection, i.e., the magnitude
of $\Delta q$, depends on the initial gene frequency $q$. The nature of this relationship is
best appreciated from graphs showing $\Delta q$ at different values of $q$. Figure 2.2 shows
these graphs for the cases of no dominance and complete dominance. They also
distinguish between selection in the two directions. A value of $s = 0.2$ was chosen
for the coefficient of selection because, for reasons given in Chapter 12, this seems
to be the right order of magnitude for the coefficient of selection operating on genes
concerned with metric characters in laboratory selection experiments. First we may
note that with this value of $s$ there is never a great difference in $\Delta q$ according to
the direction of selection. The two important points about the effectiveness of selection
that these graphs demonstrate are: (1) selection is most effective at intermediate
gene frequencies and becomes least effective when $q$ is either large or small; (2)
selection for or against a recessive gene is extremely ineffective when the recessive
allele is rare. This is the consequence of the fact, noted earlier, that when a gene
is rare it is represented almost entirely in heterozygotes.

Another way of looking at the effect of the initial gene frequency on the effectiveness
of selection is to plot a graph showing the course of selection over a number
of generations, starting from one or other extreme. Such graphs are shown in Fig.
2.3. They were constructed directly from those of Fig. 2.2, and refer again to a
coefficient of selection $s = 0.2$. They show that the change due to selection is at
first very slow, whether one starts from a high or low initial gene frequency; it
becomes more rapid at intermediate frequencies and falls off again at the end. In
the case of a fully dominant gene one is chiefly interested in the frequency of the
homozygous recessive genotype, i.e., $q^2$. For this reason the graph shows the effect
of selection on $q^2$ instead of on $q$.
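
The two points about effectiveness can be seen numerically by evaluating $\Delta q$ over
a grid of gene frequencies with $s = 0.2$. This is a rough sketch; the function names
and the grid spacing of 0.01 are assumptions made for illustration.

```python
def dq_no_dominance(q, s):
    """Line (1) of Table 2.2: selection against a gene with no dominance."""
    return -0.5 * s * q * (1 - q) / (1 - s * q)


def dq_recessive(q, s):
    """Line (3) of Table 2.2: selection against a recessive gene."""
    return -s * q * q * (1 - q) / (1 - s * q * q)


s = 0.2
grid = [i / 100 for i in range(1, 100)]            # q = 0.01, 0.02, ..., 0.99

# Gene frequency at which the response to selection is largest.
q_peak_nd = max(grid, key=lambda q: abs(dq_no_dominance(q, s)))
q_peak_rec = max(grid, key=lambda q: abs(dq_recessive(q, s)))
print(q_peak_nd, q_peak_rec)

# Response when the gene is rare (q = 0.01).
print(dq_no_dominance(0.01, s), dq_recessive(0.01, s))
```

Both peaks fall at intermediate frequencies, and at $q = 0.01$ the response for the
recessive gene is far smaller than for the gene with no dominance.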

Fig. 2.2. Change of gene frequency, $\Delta q$, under selection of intensity $s = 0.2$, at different values of
initial gene frequency, $q$. Upper figure: a gene with no dominance. Lower figure: a gene with
complete dominance. The graphs marked (-) refer to selection against the gene whose
frequency is $q$, so that $\Delta q$ is negative. The graphs marked (+) refer to selection in favour of the
gene, so that $\Delta q$ is positive. (After Falconer, 1954.)

Fig. 2.3. Change of gene frequency during the course of selection from one extreme to the other.
Intensity of selection, $s = 0.2$. Upper figure: a gene with no dominance. Lower figure: a gene
with complete dominance, $q$ being the frequency of the recessive allele and $q^2$ that of the
recessive homozygote. The graphs marked (-) refer to selection against the gene whose
frequency is $q$, so that $q$ or $q^2$ decreases. The graphs marked (+) refer to selection in favour of
the gene, so that $q$ or $q^2$ increases. (After Falconer, 1954.)

Example 2.1 Figure 2.4 shows the change of gene frequency of an autosomal recessive
lethal in Drosophila melanogaster described by Wallace (1963). The population was started
from flies that were all heterozygotes, and the gene frequency in generation 0 was consequently
$q = 0.5$. The parents of each subsequent generation were a random sample
of the surviving progeny of the previous generation. Only heterozygotes and normal
homozygotes survived. Heterozygotes were identified by test matings. About 100 to 200
flies were tested in each generation, giving a count of about 200 to 400 genes from which
to estimate the gene frequency. The observed gene frequency ± two standard errors
is plotted for each of 10 successive generations.

Fig. 2.4. Change of gene frequency under natural selection in the laboratory, as described in
Example 2.1. (Adapted from Wallace, 1963.)

Expected gene frequencies were calculated for each generation by the formulae for
$q_1$ in Table 2.2. Two expectations were calculated. The first, shown by the broken line,
assumes the lethal to be completely recessive. With $s = 1$, the formula in line (3) of
Table 2.2 reduces to $q_1 = q/(1 + q)$. The observed results suggest that the gene frequency
was reduced a little faster than would be expected for a completely recessive
lethal gene. The second expectation, shown by the dotted line, assumes the fitness of
heterozygotes was reduced by 10 per cent. With $s = 1$, the formula in line (2) of Table
2.2 reduces to $q_1 = pq(1 - h)/[p^2 + 2pq(1 - h)]$, and this was evaluated with $h = 0.1$.
The results agree well with this expectation.
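
The two expectations described in Example 2.1 can be generated with a few lines of
code. This sketch only reproduces the expected curves from the two formulae quoted
in the example; it does not contain Wallace's observed data, and the function names
are assumptions.

```python
def lethal_recessive(q):
    """Line (3) of Table 2.2 with s = 1: completely recessive lethal."""
    return q / (1 + q)


def lethal_partial(q, h):
    """Line (2) of Table 2.2 with s = 1: heterozygote fitness is 1 - h."""
    p = 1 - q
    return p * q * (1 - h) / (p * p + 2 * p * q * (1 - h))


q_rec = q_het = 0.5                    # generation 0: all flies heterozygous
for generation in range(1, 11):
    q_rec = lethal_recessive(q_rec)
    q_het = lethal_partial(q_het, h=0.1)
    print(generation, round(q_rec, 4), round(q_het, 4))
```

In every generation the second column (heterozygotes reduced in fitness by 10 per
cent) lies below the first, which is the direction of the departure described in the
example.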

Number of generations required 

How many generations of selection would be needed to effect a specified change 
of gene frequency? An answer to this question may be required in connection with 
breeding programmes or proposed eugenic measures. We shall consider only the 
case of selection against a recessive when elimination of the unwanted homozygote
is complete, i.e., $s = 1$. This would apply to natural selection against a recessive
lethal, and to artificial selection against an unwanted recessive in a breeding programme.
We shall also, for the moment, suppose that there is no mutation. The
expression for the new gene frequency after one generation was given in equation 
[2.5] (and in line (3) of Table 2.2). 
Substituting $s = 1$ in this equation and writing $q_0, q_1, q_2, \ldots, q_t$ for the gene
frequency after $0, 1, 2, \ldots, t$ generations of selection, we have

$$q_1 = \frac{q_0}{1 + q_0}$$

and

$$q_2 = \frac{q_1}{1 + q_1} = \frac{q_0}{1 + 2q_0}$$

by substituting for $q_1$ and simplifying. So in general

$$q_t = \frac{q_0}{1 + tq_0}$$ ... [2.9]

and the number of generations, $t$, required to change the gene frequency from $q_0$
to $q_t$ is

$$t = \frac{q_0 - q_t}{q_0 q_t} = \frac{1}{q_t} - \frac{1}{q_0}$$ ... [2.10]


The example below illustrates the point, already made, that when the frequency of 
a recessive gene is low, selection is very slow to change it. 


Example 2.2 It is sometimes suggested, as a eugenic measure, that those suffering from 
serious inherited defects should be prevented from reproducing, since in this way the 
frequency of such defects would be reduced in future generations. Before deciding whether 
the proposal is a good one, we ought to know what it would be expected to achieve. 
We cannot properly discuss this problem without taking mutation into account, as we 
shall do later; the answer we get, ignoring mutation, shows what is the best that could 
be hoped for. Albinism will serve as an example, though it is not a very serious defect. 
Supposing albinism to be due to a single recessive gene, how long would it take to reduce 
the frequency of albino individuals to half its present value? The present frequency among 
European people is about 1/20,000. This is $q_0^2$, and it gives $q_0 = 1/141$. The objective
is $q_t^2 = 1/40,000$, $q_t = 1/200$. So, from equation [2.10], $t = 200 - 141 = 59$ generations.
With 25 years to a generation it would take nearly 1,500 years to achieve this
modest objective. Albinism is not, in fact, a single genetic entity, but can be caused 
by at least two different recessive genes, each at a frequency lower than 1/141. So in 
reality the elimination would be even slower. 
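
The arithmetic of Example 2.2 can be reproduced from equation [2.10] and
cross-checked by iterating the one-generation formula. The sketch below assumes,
as the example does, a single fully recessive gene, complete selection ($s = 1$) and
no mutation; the function name `generations_needed` is illustrative.

```python
from math import sqrt


def generations_needed(q0, qt):
    """Equation [2.10] with s = 1 against a recessive: t = 1/qt - 1/q0."""
    return 1 / qt - 1 / q0


freq_now = 1 / 20_000                  # present frequency of homozygotes
q0 = sqrt(freq_now)                    # present gene frequency
qt = sqrt(freq_now / 2)                # gene frequency when homozygotes are halved
t = generations_needed(q0, qt)
print(round(1 / q0), round(1 / qt), round(t))

# Cross-check by applying q -> q / (1 + q) until the target is reached.
q, steps = q0, 0
while q > qt:
    q = q / (1 + q)
    steps += 1
print(steps)
```

Both routes give the same number of generations, because each generation of
complete selection adds exactly 1 to the reciprocal of the gene frequency.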

In domesticated species, of course, the elimination of deleterious genes can be 
greatly speeded up by progeny testing. Test-matings are made to known 
heterozygotes, and this identifies heterozygotes among those tested. The gene can 
then be eliminated very quickly. The gene persists only in heterozygotes that have 
been misclassified as normal through an inadequate number of progeny. 

Average fitness and load

When the gene frequency is changed by selection, some individuals must suffer 
‘genetic death’ by their failure to survive or to reproduce, and the average fitness 
of the population is thereby reduced. The proportion of the population that suffers 
genetic death is called the load borne by the population as a consequence of the 
presence of the deleterious gene in it. If L is the load, the average fitness of the 
population is $1 - L$. The average fitness and the load were deduced in Table 2.1
without being specifically pointed out. The average fitness is the total of the genotype
frequencies after selection and is the denominator of all the expressions for $q_1$ or
$\Delta q$ given in Table 2.2. For a recessive gene, for example, the average fitness is
$1 - sq^2$ and the load is $sq^2$. The average fitness is again relative fitness, relative
to a population that does not have the deleterious gene in it. The load is not necessarily
a real detriment to the population, because most species produce more offspring
than the resources of their environment can support, and the death of some individuals
from genetic causes leaves room for others that would otherwise have died from
lack of food or some other cause. There is a species of Drosophila, for example
(D. tropicalis, from Central America), in which 50 per cent of individuals in a certain
locality suffer genetic death, and yet the population flourishes (Dobzhansky and
Pavlovsky, 1955). 
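
As a small illustration of the definitions, the average fitness and the load can be
computed directly from the genotype frequencies and fitnesses. The function name
`mean_fitness` and the values of $q$ and $s$ are assumptions made for the example.

```python
def mean_fitness(q, w11, w12, w22):
    """Total of the genotype frequencies after selection."""
    p = 1 - q
    return p * p * w11 + 2 * p * q * w12 + q * q * w22


q, s = 0.1, 0.5                        # assumed values
w_bar = mean_fitness(q, 1, 1, 1 - s)   # recessive gene: fitnesses 1, 1, 1 - s
load = 1 - w_bar
print(w_bar, load, s * q**2)
```

For a recessive gene the load computed this way should equal $sq^2$.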

Equilibria 

Balance between mutation and selection 

Having described the effects of mutation and selection separately, we must now compare
them and consider them jointly. Which is the more effective process in causing
change of gene frequency? Is it reasonable to attribute the low frequency of deleterious 
genes that we find in natural populations to the balance between mutation tending 
to increase the frequency and selection tending to decrease it? The expressions already 
obtained for the change of gene frequency under mutation or selection alone show 
that both depend on the initial gene frequency, but in different ways. Mutation to 
a particular gene is most effective in increasing its frequency when the mutant gene 
is rare (because there are more of the unmutated genes to mutate); but selection is 
least effective when the gene is rare. The relative effectiveness of the two processes 
depends therefore on the gene frequency, and if both processes operate for long 
enough a state of equilibrium will eventually be reached. So we must find what the 
gene frequency will be when equilibrium is reached. This is done by equating the 
two expressions for the change of gene frequency, because at equilibrium the change 
due to mutation will be equal and opposite to the change due to selection. 

Let us consider first a fully recessive gene with frequency $q$, mutation rate to it
$u$, and from it $v$, and selection coefficient against it $s$. Then from equations [2.3]
and [2.6], we have at equilibrium

$$u(1 - q) - vq = \frac{sq^2(1 - q)}{1 - sq^2}$$ ... [2.11]


This equation is too complicated to give a clear answer to our question. But we can
make two simplifications with only a trivial sacrifice of accuracy. We are specifically
interested in genes at low equilibrium frequencies. If $q$ is small, the term $vq$ representing
back mutation is relatively unimportant and can be neglected; and we can use
the approximate expression (equation [2.8]) for the selection effect. Making these
simplifications we have the equilibrium condition for selection against a recessive
gene:

$$u(1 - q) = sq^2(1 - q)$$ (approx.)

$$u = sq^2$$ (approx.) ... [2.12]

$$q = \sqrt{u/s}$$ (approx.) ... [2.13]

For a gene with no dominance, similar reasoning from line (1) in Table 2.2 gives
the equilibrium condition

$$q = \frac{2u}{s}$$ (approx.) ... [2.14]


Finally, consider selection against a completely dominant gene, the frequency of
the dominant gene being $1 - q$, and the mutation rate to it being $v$. In this case
$1 - q$ is very small and the term $u(1 - q)$ in equation [2.11] is negligible. We
have therefore at equilibrium

$$vq = sq^2(1 - q)$$ (approx.)

or

$$q(1 - q) = \frac{v}{s}$$ (approx.)

$$H = \frac{2v}{s}$$ (approx.) ... [2.15]

where $H$ is the frequency of heterozygotes. If the mutant gene is rare, $H$ is very
nearly the frequency of the mutant phenotype in the population.
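
The approximate equilibria [2.13] to [2.15] are easy to evaluate, and the recessive
case can be checked against the fuller balance in equation [2.11] with back mutation
ignored. The function names and the values of $u$ and $s$ below are illustrative
assumptions.

```python
from math import sqrt


def eq_recessive(u, s):
    """Equation [2.13]: equilibrium frequency of a recessive gene."""
    return sqrt(u / s)


def eq_no_dominance(u, s):
    """Equation [2.14]: equilibrium frequency of a gene with no dominance."""
    return 2 * u / s


def eq_dominant_H(v, s):
    """Equation [2.15]: equilibrium frequency of heterozygotes for a dominant gene."""
    return 2 * v / s


u, s = 1e-5, 0.1                       # assumed mutation rate and selection coefficient
q_hat = eq_recessive(u, s)

# Gain from mutation and loss from selection at q_hat (back mutation ignored).
gain = u * (1 - q_hat)
loss = s * q_hat**2 * (1 - q_hat) / (1 - s * q_hat**2)
print(q_hat, gain, loss)
print(eq_no_dominance(u, s), eq_dominant_H(u, s))
```

The gain and loss agree closely but not exactly, which reflects the small sacrifice of
accuracy made by dropping the denominator.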


Example 2.3 If the equilibrium state is accepted as applicable, we can use it to get an 
estimate of the mutation rate of dominant abnormalities, for which the coefficient of 
selection is known. Among some human examples described by Haldane (1949) is the 
case of dominant dwarfism (chondrodystrophy) studied in Denmark. The frequency of 
dwarfs was estimated at $10.7 \times 10^{-5}$, and their fitness ($1 - s$) at 0.196. The estimate
of fitness was made from the number of children produced by dwarfs compared with
their normal sibs. The mutation rate, by equation [2.15], comes out at $4.3 \times 10^{-5}$.
Though there is a possibility of serious error in the estimate of frequency owing to prenatal 
mortality of dwarfs, the mutation rate is almost certainly estimated within the right order 
of magnitude. The mutation rate to recessives cannot be reliably estimated in this way 
because the estimate is very sensitive to small departures from equilibrium. (For more 
about mutation rates in man, see Stern, 1973.)
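
The estimate in Example 2.3 follows from rearranging equation [2.15] to
$v = sH/2$. The short check below uses only the two figures quoted in the example;
the variable names are illustrative.

```python
H = 10.7e-5          # frequency of dwarfs, taken as the frequency of heterozygotes
fitness = 0.196      # fitness of dwarfs, 1 - s
s = 1 - fitness      # coefficient of selection
v = s * H / 2        # equation [2.15] rearranged for the mutation rate
print(f"{v:.2e}")
```

The result rounds to the mutation rate given in the example.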


These expressions for the equilibrium gene frequency under the joint action of 
mutation and selection show that the gene frequency can have any value at 
equilibrium, depending on the relative magnitude of the mutation rate and the coefficient
of selection. But if mutation rates are of the order of magnitude commonly
accepted, i.e., $10^{-5}$ or thereabouts, then only a mild selection against the mutant
gene will be needed to hold it at a very low equilibrium frequency. For example,
if a gene mutates at the rate of $10^{-5}$, a selective disadvantage of 10 per cent is
enough to hold the frequency of the recessive homozygote at 1 in 10,000; and a 
50 per cent disadvantage will hold it at 1 in 50,000. It is quite clear therefore that 
the low frequency of deleterious mutants in natural populations is in accord with 
what would be expected from the joint action of mutation and selection. 

Let us now consider the load, or proportion of genetic deaths, when a population 
is in equilibrium. The load from a recessive gene is $sq^2$, as explained earlier. The
equilibrium equation [2.12] therefore shows that the load at equilibrium is 

L = u ... [2.16] 

Thus the load depends only on the mutation rate and not at all on how seriously 
deleterious the gene is. The reason for this surprising conclusion is that a more 
deleterious gene comes to equilibrium at a lower gene frequency; there are therefore 
fewer homozygotes, though more of them die. With a less deleterious gene there 
are more homozygotes, but fewer of them die. Since the load from each locus does 
not depend on the selection coefficient, the total load from recessive alleles at all 
loci is simply the sum of the mutation rates, $\Sigma u$. With deleterious dominant genes,
the homozygotes are so rare that they can be neglected. The load therefore comes 
from the death of heterozygotes and is therefore L = sH, where H is the frequency 
of heterozygotes. Substituting the equilibrium frequency of heterozygotes from equa¬ 
tion [2.15] gives the load at equilibrium as 

L = 2v ... [2.17] 

where v is the mutation rate to the dominant allele. Again the load is not affected 
by the harmfulness of the gene. Comparison of equations [2.16] and [2.17] raises 
another question. Why should the load from a dominant gene be twice that from 
a recessive with the same mutation rate? The reason is that the death of a mutant 
homozygote removes two genes from the population whereas the death of a 
heterozygote removes only one mutant gene. Equation [2.17] seems to suggest that 
the loss of one gene by the death of a heterozygote balances the introduction of two 
genes by mutation. This is not so because the loss by death is expressed per individual, 
whereas the gain by mutation is expressed per gamete: the mutation rate per individual 
is 2v. The load from partially dominant alleles is between u and 2u, and the total
load from all loci is between $\sum u$ and $2\sum u$, where u is the mutation rate to alleles
with any degree of dominance. If mutation rates are about $10^{-5}$, an organism with
10,000 loci capable of mutation to deleterious alleles would have a total load of
between 10 and 20 per cent; that is to say, about 1 or 2 zygotes in 10 would die as
a result of mutation.
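
The independence of the load from the selection coefficient, and the total-load arithmetic, can be illustrated as follows. The sketch assumes the approximation $q^2 = u/s$ for a recessive at equilibrium; the function names and the choice of 10,000 identical loci are illustrative only.

```python
from math import fsum


def recessive_load(u: float, s: float) -> float:
    """Load s * q^2 at equilibrium, taking q^2 = u / s (assumed approximation)."""
    q_squared = u / s
    return s * q_squared


def total_load_bounds(mutation_rates):
    """Lower bound (all recessive) and upper bound (all dominant) of the total load."""
    total = fsum(mutation_rates)
    return total, 2.0 * total


# The load from one recessive locus is the same whatever the value of s.
for s in (0.1, 0.5, 1.0):
    print(f"s = {s}: load = {recessive_load(1e-5, s):.1e}")

low, high = total_load_bounds([1e-5] * 10_000)
print(f"total load between {low:.0%} and {high:.0%}")
```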

The fact that recessive genes at low frequencies respond only very slowly to
selection makes it very unlikely that rare recessives are at their equilibrium frequencies
in real populations. Unless the environmental conditions remain exceptionally
constant over a long period, selection coefficients are likely to change faster than
selection can adjust the frequency to each new equilibrium value. This, of course, would
not apply to genes that are lethal under all conditions. There is also another reason,
which applies to lethals too, for thinking that present-day human populations are
not in equilibrium. Modern civilization has reduced the subdivision into local,
partially inbreeding, groups, and this has reduced the frequency of homozygotes as will
be explained in the next chapter. In consequence, both the gene frequencies and
the homozygote frequencies are below their equilibrium values, and must be presumed
to be at present increasing slowly toward new equilibria at higher values.
Deductions about rare recessives based on the supposition of equilibrium therefore cannot
be made, particularly for human populations.

Changes of equilibrium 

Mutation rates can be increased by artificially produced radiation or environmental 
chemicals; selection coefficients can be reduced by medical treatment or by
domestication, or they can be increased by eugenic measures. What effects would be expected
from these changes? Whatever the change, there will be a new equilibrium gene
frequency toward which the population will start to move. The effect on the
frequency of affected individuals, when the new equilibrium is reached, can readily
be seen from equations [2.13] and [2.15]: for example, doubling the mutation rate, 
or halving the selection coefficient, would eventually double the frequency of affected 
individuals. The immediate effect of increasing the mutation rate depends on the 
coefficient of selection against the gene — the lower the selection coefficient, the 
slower the approach to equilibrium. The consequences of an increased mutation rate 
would unquestionably be harmful. (For an assessment of the consequences see Crow,
1957.) The consequences of changing the selection coefficient one way or the other,
however, need some comment. 
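
A small simulation illustrates both statements about a doubled mutation rate: the frequency of affected homozygotes eventually doubles, and the approach is slower when selection is weaker. The recurrence used here is an assumption, not a formula taken from this section: selection against a recessive changes q by $-sq^2(1-q)/(1-sq^2)$, after which a fraction u of the remaining normal alleles mutate; back-mutation is ignored. All names are illustrative.

```python
def next_q(q: float, u: float, s: float) -> float:
    """One generation: selection against the recessive homozygote, then mutation.

    Assumed model: delta q from selection is -s*q^2*(1-q)/(1-s*q^2); mutation
    then converts a fraction u of the normal alleles. Back-mutation is ignored.
    """
    q_selected = q - s * q * q * (1.0 - q) / (1.0 - s * q * q)
    return q_selected + u * (1.0 - q_selected)


def affected_after(u: float, s: float, q_start: float, generations: int) -> float:
    """Frequency of affected homozygotes, q^2, after the given number of generations."""
    q = q_start
    for _ in range(generations):
        q = next_q(q, u, s)
    return q * q


def generations_to_halfway(u_old: float, u_new: float, s: float,
                           max_generations: int = 1_000_000) -> int:
    """Generations needed for q to move halfway from the old to the new equilibrium."""
    q_start = (u_old / s) ** 0.5
    q_target = (u_new / s) ** 0.5
    halfway = 0.5 * (q_start + q_target)
    q = q_start
    for generation in range(1, max_generations + 1):
        q = next_q(q, u_new, s)
        if q >= halfway:
            return generation
    raise RuntimeError("halfway point not reached")


u_old, u_new = 1e-5, 2e-5
s = 0.1
q_old = (u_old / s) ** 0.5
ratio = affected_after(u_new, s, q_old, 20_000) / (q_old * q_old)
print(f"affected frequency multiplied by {ratio:.3f} at the new equilibrium")

for s in (0.1, 0.5):
    print(f"s = {s}: {generations_to_halfway(u_old, u_new, s)} generations to halfway")
```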

Intensification of selection has sometimes been advocated as a eugenic measure 
for human populations. Example 2.2 showed how extremely slow such measures 
applied to a recessive gene would be to make a worthwhile reduction of its frequency. 
When mutation is taken into consideration the prospects are seen to be even worse. 
Not only is mutation hindering the selection, but it puts a limit — the equilibrium 
for s = 1 — below which the frequency cannot be reduced. Serious defects, moreover, 
have already a fairly strong natural selection working on them, and the addition of 
artificial selection can do no more than make the coefficient of selection s equal 
to 1. This would probably seldom do more than double the present coefficient of 
selection, and the incidence of defects would be reduced to not less than half their 
present values. 

Perhaps the reduced intensity of natural selection under modern conditions should
give us more concern. Minor genetic defects, such as colour-blindness, must 
presumably have had some selective disadvantage in the past but now have very 
little, if any, effect on fitness. Moreover, medical treatment removes, or reduces, 
the selection pressure against susceptibility to a variety of diseases that have at least 
some degree of genetic causation. This relaxation of natural selection suggests that 
the frequencies of the genes concerned will increase toward new equilibria at higher 
values. If this is true we must expect the incidence of minor genetic defects to
increase in the future, and also the proportion of people who need medical treatment
for a variety of diseases. By applying humanitarian principles for our own good
now we are perhaps laying up a store of inconvenience for our descendants in the
distant future.

Selection favouring heterozygotes

We have considered the effects of selection operating on genes that are partially 
or fully dominant with respect to fitness; but, though the appropriate formula was 
given in Table 2.2, we have not yet discussed the consequences of overdominance 
with respect to fitness; that is, when the heterozygote has a higher fitness than either 
homozygote. At first sight it may seem rather improbable that selection should favour 
the heterozygote of two alleles rather than one or other of the homozygotes. There 
is good evidence, however, that it does occur, though opinion is divided on how 
common a situation it is. Let us first examine the consequences of this form of
selection, and then consider how it might operate.

Selection operating on a gene with partial or complete dominance tends toward 
the total elimination of one or other allele, the final gene frequency, in the absence 
of mutation, being 0 or 1. When selection favours the heterozygote, however, the 
gene frequency tends toward an equilibrium at an intermediate value, both alleles 
remaining in the population, even without mutation. The reason is as follows. The 
change of gene frequency after one generation was given in Table 2.2 as being 

$$\Delta q = \frac{pq(s_1 p - s_2 q)}{1 - s_1 p^2 - s_2 q^2}$$

The condition for equilibrium is that $\Delta q = 0$, and this is fulfilled when $s_1 p = s_2 q$.
The gene frequencies at this point of equilibrium are therefore

$$\frac{p}{q} = \frac{s_2}{s_1}$$ ... [2.18]

or

$$q = \frac{s_1}{s_1 + s_2}$$ ... [2.19]

Now, if q is greater than its equilibrium value (but not 1), and p therefore less, $s_1 p$
will be less than $s_2 q$, and $\Delta q$ will be negative; that is to say q will decrease.
Similarly, if q is less than its equilibrium value (but not 0) it will increase. Therefore
when the gene frequency has any value, except 0 or 1, selection changes it toward
the intermediate point of equilibrium given in equation [2.19], and both alleles
remain permanently in the population. Three or more alleles at a locus can be
maintained in the same way. The selective forces required are, however, less simple;
see Crow and Kimura (1970, p. 277). A feature of the equilibrium worthy of note
is that the gene frequency depends not on the degree of superiority of the heterozygote 
but on the relative disadvantage of one homozygote compared with that of the other. 
Therefore there is a point of equilibrium at some more or less intermediate gene 
frequency whenever a heterozygote is superior to both the homozygotes, no matter
by how little.
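
The stability of this equilibrium can be checked numerically by iterating the expression for the change of gene frequency, $\Delta q = pq(s_1 p - s_2 q)/(1 - s_1 p^2 - s_2 q^2)$, from starting frequencies on either side of the equilibrium. The selection coefficients below are arbitrary illustrative values, and the function names are not from the text.

```python
def delta_q(q: float, s1: float, s2: float) -> float:
    """Change of gene frequency in one generation when the heterozygote is favoured."""
    p = 1.0 - q
    return p * q * (s1 * p - s2 * q) / (1.0 - s1 * p * p - s2 * q * q)


def iterate(q: float, s1: float, s2: float, generations: int) -> float:
    """Apply delta_q repeatedly and return the final gene frequency."""
    for _ in range(generations):
        q += delta_q(q, s1, s2)
    return q


s1, s2 = 0.1, 0.3                      # illustrative selection coefficients
q_equilibrium = s1 / (s1 + s2)         # intermediate equilibrium, 0.25 here

for q_start in (0.01, 0.9):
    q_final = iterate(q_start, s1, s2, 1000)
    print(f"start {q_start}: q after 1000 generations = {q_final:.4f}")
```

Both runs end at the same intermediate frequency, whereas a start at exactly 0 or 1 stays where it is because pq = 0.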

The load resulting from overdominance for fitness is $s_1 p^2 + s_2 q^2$, from the
denominator of the expression for $\Delta q$. Substituting the equilibrium value for q from
equation [2.19], and the analogous value for p, leads to the following expression
for the load at equilibrium:

$$L = \frac{s_1 s_2}{s_1 + s_2}$$ ... [2.20]

Substitution of $s_2 = s_1 p/q$ and separately of $s_1 = s_2 q/p$ from equation [2.18] leads to

$$L = s_1 p = s_2 q$$ ... [2.21]

Thus the load depends on the selection coefficients, unlike the load due to recurrent 
mutation, and the total load from all overdominant loci cannot be obtained in any 
simple way by summation. 
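
As a quick numerical check, the three expressions for the load under overdominance agree at equilibrium. The selection coefficients are arbitrary illustrative values and the function names are assumptions of this sketch.

```python
def equilibrium_q(s1: float, s2: float) -> float:
    """Equilibrium gene frequency when the heterozygote is favoured."""
    return s1 / (s1 + s2)


def load_at_equilibrium(s1: float, s2: float) -> float:
    """Load at equilibrium, s1*s2 / (s1 + s2)."""
    return s1 * s2 / (s1 + s2)


s1, s2 = 0.1, 0.3                        # illustrative values only
q = equilibrium_q(s1, s2)
p = 1.0 - q

load_from_definition = s1 * p * p + s2 * q * q
print(load_from_definition, load_at_equilibrium(s1, s2), s1 * p, s2 * q)
```

Unlike the mutational load, the value changes whenever either selection coefficient changes.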

An example of heterozygote advantage is described below, and the possible causes 
of overdominance for fitness are then discussed. First, however, we must examine 
the effects of selection on observed genotype frequencies. 

Selection and the Hardy-Weinberg test. In the previous chapter a test was
described which, by comparing observed with expected genotype frequencies, tests for
the fulfilment of the conditions required for generating Hardy-Weinberg
frequencies. Since the absence of selection is one condition, disagreement between observed
and expected frequencies can provide evidence that selection has operated. But the 
conclusions that can be drawn are limited, and we must now look more closely into 
what the test can and cannot reveal. The test requires a locus at which heterozygotes 
can be distinguished, so that the gene frequency can be determined by counting. 
Usually the genotypes observed are those of adults in only one generation. They 
have been subject to selection through viability differences, but not yet to selection 
through fertility differences. The test does not reveal selection on fertility. If the 
parents differ in fertility, random mating produces Hardy-Weinberg frequencies 
in the zygotes that will become the adults observed. The test therefore can only reveal 
selection acting through viability. It is tempting to believe that the relative viabilities 
of the genotypes can be deduced from their deviations from expectation,
particularly to see an excess of heterozygotes as evidence of heterozygote advantage. This,
however, is not a valid deduction, for the following reason. The expectations are 
calculated from the observed gene frequency, and they are expectations based on 
the supposition that the gene frequency in the zygotes was the same as that observed 
in the adults. This supposition may not be correct, in which case the expectations 
will have been wrongly calculated. The consequence of calculating wrong
expectations is that an apparent excess of heterozygotes can result from selection acting
against only one homozygote, i.e., from a gene that is completely recessive with 
respect to fitness. The numerical example in Table 2.3 will make clear what
happens.

Table 2.3 Spurious heterozygote advantage from the Hardy-Weinberg test.

                                      Genotypes                        Gene frequency
                           $A_1A_1$   $A_1A_2$   $A_2A_2$   Total      of $A_2$
Number of zygotes          36         48         16         100        0.4
Number of adults           36         48         6          90
Frequency of adults (%)    40.0       53.3       6.7        100        0.33
H-W expectation (%)        44.4       44.5       11.1       100

The zygotes are in Hardy-Weinberg frequencies corresponding to a gene
frequency of 0.4 for the $A_2$ allele. Of the 16 $A_2A_2$ zygotes, 6 survive to be counted
as adults, and the gene frequency is reduced to 1/3. With this observed gene
frequency the Hardy-Weinberg expectations are calculated as shown. The result is
that the observed frequency of heterozygotes is above expectation and those of both 
homozygotes are below their expectations. The test appears to indicate heterozygote 
superiority, but the selection in fact was against one homozygote only. The only 
situation in which heterozygote advantage can be inferred from an excess of 
heterozygotes is when the population is in equilibrium and the gene frequency is 
not changing and, furthermore, when there are no differences of fertility among the 
parents. Only then is the gene frequency the same in the adults as it was in the zygotes. 
It must be remembered, however, that an excess of heterozygotes results also from 
unequal gene frequencies in the male and female parents. A difference between the 
sexes can occur by chance if the sample of progeny used for the test is a small one. 
(This is explained in the next chapter.) 
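
The numbers in Table 2.3 can be reproduced with a short script, which makes the source of the spurious excess easy to see. The dictionary keys and variable names are illustrative; the survival of 6 out of 16 $A_2A_2$ zygotes is the figure used in the table.

```python
# Zygotes in Hardy-Weinberg frequencies for q = 0.4, as in Table 2.3.
zygotes = {"A1A1": 36, "A1A2": 48, "A2A2": 16}
# Selection acts against one homozygote only: 6 of the 16 A2A2 survive.
survival = {"A1A1": 1.0, "A1A2": 1.0, "A2A2": 6 / 16}

adults = {g: zygotes[g] * survival[g] for g in zygotes}
n_adults = sum(adults.values())
observed = {g: adults[g] / n_adults for g in adults}

# Gene frequency counted in the adults, then the expectations based on it.
q = observed["A2A2"] + 0.5 * observed["A1A2"]
p = 1.0 - q
expected = {"A1A1": p * p, "A1A2": 2 * p * q, "A2A2": q * q}

for g in observed:
    print(f"{g}: observed {observed[g]:.1%}, expected {expected[g]:.1%}")
```

The heterozygotes come out above expectation and both homozygotes below it, although only $A_2A_2$ was selected against.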

Consider now what the observed genotype frequencies can tell us about the relative 
viabilities. Let us assign viabilities relative to that of the heterozygote. With
Hardy-Weinberg frequencies in the zygotes, the genotype frequencies in the adults will
be as follows, when P, H and Q are the observed frequencies.

Genotype               $A_1A_1$         $A_1A_2$   $A_2A_2$         Total
Frequency in adults    $p^2(1 - s_1)$   $2pq$      $q^2(1 - s_2)$   $1 - s_1 p^2 - s_2 q^2$
                       P                H          Q                1

The discrepancy can be expressed as $PQ/(\tfrac{1}{2}H)^2$, which with Hardy-Weinberg
frequencies is equal to 1. When selection is taken into account it can readily be shown that

$$\frac{PQ}{(\tfrac{1}{2}H)^2} = (1 - s_1)(1 - s_2)$$ ... [2.22]

This measure of the discrepancy is therefore an estimate of the product of the 
viabilities of the two homozygotes relative to the heterozygote. If one allele is known 
to be recessive, then the other homozygote has unreduced viability; e.g., if $A_2$ is
recessive, $(1 - s_1) = 1$. Then the viability, $(1 - s_2)$, of the recessive homozygote
can be estimated. 
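
Applied to the adult counts of Table 2.3 (36, 48 and 6), this measure recovers the viability that was built into the example. The function name is illustrative, and the calculation assumes, as the text does, that the zygotes were in Hardy-Weinberg frequencies.

```python
def viability_product(P: float, H: float, Q: float) -> float:
    """Estimate (1 - s1)(1 - s2) as PQ / (H/2)^2 from observed adult frequencies."""
    return P * Q / (0.5 * H) ** 2


counts = {"A1A1": 36, "A1A2": 48, "A2A2": 6}      # adults in Table 2.3
n = sum(counts.values())
P, H, Q = (counts[g] / n for g in ("A1A1", "A1A2", "A2A2"))

product = viability_product(P, H, Q)
print(f"(1 - s1)(1 - s2) = {product:.3f}")
```

If $A_2$ is taken to be recessive, so that $(1 - s_1) = 1$, the product is the viability of $A_2A_2$ itself, and it equals the 6/16 survival used to construct the table.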

It will be seen from the brief account given that the estimation of relative fitnesses 
is not straightforward. For a fuller treatment of the problems, see Prout (1965). 

Example 2.4 Sickle-cell anaemia in man is a well-known example of heterozygote
advantage. It is particularly useful as an example because the data allow a test of observation
with theory. The disease is caused by the abnormal haemoglobin-S. Homozygotes
suffer from a severe anaemia from which many die, yet the gene is present among Africans
and their descendants in America at frequencies much too high to be accounted for by
mutation counterbalancing the selection against homozygotes. The explanation of the
high frequencies is that heterozygotes have an advantage over normal homozygotes through
an increased resistance to malaria (Allison, 1954). The selective forces can be calculated
from data given by Allison (1956) and one can then see how well these can account for
the observed gene frequency. Allison classified 287 infants and 654 adults, from a district
of Tanzania, for genotype. (Homozygotes are recognized by the presence of red blood
cells with a characteristic ‘sickle’ shape; heterozygotes are recognized by the sickling
of their cells when the blood sample is deoxygenated.) The observed numbers and
frequencies are shown in the table, with the gene frequencies calculated from them by
equation [1.1]. (AA denotes the normal homozygote, AS the heterozygote, and SS the anaemic
homozygote.) Most of the differential selection is thought to take place before adulthood,
i.e., the surviving genotypes do not differ much in fertility. The infants therefore
represent the genotype frequencies before selection and the adults after selection, and so we
can calculate the selection coefficients from the observed frequencies. First, however,
note that if the gene frequency is in equilibrium it will be the same after selection as
it was before, and the data agree well with this expectation. Dividing the frequency of
each genotype after selection by its frequency before selection gives the relative fitness
of that genotype, as shown in the table. The fitnesses of the homozygotes can then be
expressed relative to the heterozygote by dividing each by the heterozygote fitness. The
homozygote fitnesses are $1 - s_1$ and $1 - s_2$, from which the selection coefficients work
out to be 0.24 against AA and 0.80 against SS, both relative to AS. The equilibrium
gene frequency expected to result from this selection against both homozygotes, by
equation [2.19], is $q_S = 0.23$, which is reasonably close to the observed value. Thus the
selective forces observed in the differential viability do satisfactorily account for the
frequency of the sickle-cell gene in this population.



                                   Genotype                  Frequency
                          AA             AS        SS        of S-gene
Numbers of infants        189            89        9
           adults         400            249       5
Frequency in infants      0.6585         0.3101    0.0314    0.1864
             adults       0.6116         0.3807    0.0076    0.1980
Relative fitness          0.9288         1.2277    0.2420
Fitness relative to AS    0.7565         1         0.1971
Selection coefficient     $s_1 = 0.2435$           $s_2 = 0.8029$

Expected $q_S = s_1/(s_1 + s_2) = 0.2327$



The selective values may be more interesting if expressed relative to the normal 
homozygote. The fitness of AS is then $1/(1 - s_1) = 1.32$, and that of SS is
$(1 - s_2)/(1 - s_1) = 0.26$. Thus the resistance to malaria confers a 32 per cent advantage on the
heterozygote, and this balances a 74 per cent disadvantage in the anaemic homozygote 
when the gene frequency is about 0.2. 
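
The steps of Example 2.4 can be retraced from the raw counts. The sketch below uses only the numbers of infants and adults given in the example; the variable and function names are illustrative. Because it works with unrounded frequencies, the last decimals differ slightly from the table, which was evidently computed from the rounded frequencies (for instance $s_2$ comes out near 0.801 rather than 0.803).

```python
infants = {"AA": 189, "AS": 89, "SS": 9}     # before selection
adults = {"AA": 400, "AS": 249, "SS": 5}     # after selection


def genotype_frequencies(counts):
    """Convert genotype counts to frequencies."""
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


def s_gene_frequency(counts):
    """Gene frequency by counting: all of SS plus half of AS."""
    total = sum(counts.values())
    return (counts["SS"] + 0.5 * counts["AS"]) / total


before = genotype_frequencies(infants)
after = genotype_frequencies(adults)

# Relative fitness = frequency after selection / frequency before selection.
fitness = {g: after[g] / before[g] for g in before}
relative_to_AS = {g: fitness[g] / fitness["AS"] for g in fitness}

s1 = 1.0 - relative_to_AS["AA"]              # selection against AA
s2 = 1.0 - relative_to_AS["SS"]              # selection against SS
expected_q = s1 / (s1 + s2)                  # equilibrium frequency of S

print(f"q in infants = {s_gene_frequency(infants):.4f}")
print(f"q in adults  = {s_gene_frequency(adults):.4f}")
print(f"s1 = {s1:.3f}, s2 = {s2:.3f}, expected q = {expected_q:.3f}")
print(f"AS relative to AA = {1 / (1 - s1):.2f}")
print(f"SS relative to AA = {(1 - s2) / (1 - s1):.2f}")
```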

Possible causes of overdominance for fitness. Let us now consider some of the ways 
in which selection might operate so as to favour heterozygotes. One way is through 
pleiotropy, i.e., the gene having more than one phenotypic effect. To produce
overdominance for fitness, the alleles must affect two components of fitness in opposite
directions. The heterozygote advantage of sickle-cell anaemia arises in this way;
one homozygote reduces fitness through one component, the anaemia, while the other
homozygote reduces fitness through another component, susceptibility to malaria.
There are a few other genes in man where heterozygote advantage for similar reasons
is proved or suspected. Another example is the resistance of wild rats to the
anticoagulant poison warfarin (Greaves et al., 1977). The gene conferring resistance
is dominant, so that heterozygotes and homozygotes are resistant. Homozygotes, 
however, have a much increased requirement for vitamin K, which is not met by 
the normal diet. So in areas where the poison is being used, one homozygote is 
selected against by the poison and the other by the vitamin K deficiency, leading 
to an equilibrium frequency of the resistance gene, which was about 0.34 in the 
area studied. 

There are many ways in which a locus can affect different components of fitness. 
For example, the components can be different stages of the life-cycle, different 
environments encountered by the same individual at different times, or by different 
individuals in different places, the two sexes, different combinations of genes at other 
loci which modify the locus in question. The conditions that produce overdominance 
for fitness are that the alleles affect the components in opposite directions and that 
there is some degree of dominance on the scale in which the components combine 
to give fitness. The meaning of the last condition is this: if the components are 
multiplied together to give fitness, then there must be some degree of dominance 
on the geometric scale but not necessarily on the arithmetic scale. To take a simple 
example, a hypothetical locus with two alleles in mice might affect the number born 
per litter and the number of litters as follows: 



Genotype                  $A_1A_1$   $A_1A_2$   $A_2A_2$
Number per litter         6          7          8
Number of litters         8          7          6
Total number = fitness    48         49         48


Fitness is the product of the two components and there is overdominance for fitness. 
In their effects on the components separately, the alleles have no dominance on the 
arithmetic scale, but a small degree of dominance on the geometric scale, the 
geometric mean of the homozygous values being 6.9. Overdominance generated in 
this way is known as marginal overdominance (Wallace, 1968), meaning that the 
overdominance appears only in the margin of the table. 
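
The arithmetic of the hypothetical mouse locus can be laid out as follows. The numbers are those of the table above; the variable names are illustrative.

```python
from math import sqrt

per_litter = {"A1A1": 6, "A1A2": 7, "A2A2": 8}
litters = {"A1A1": 8, "A1A2": 7, "A2A2": 6}

# Fitness is the product of the two components.
fitness = {g: per_litter[g] * litters[g] for g in per_litter}

# Mid-homozygote values of one component on the two scales.
arithmetic_mid = (per_litter["A1A1"] + per_litter["A2A2"]) / 2
geometric_mid = sqrt(per_litter["A1A1"] * per_litter["A2A2"])

print(fitness)
print(f"arithmetic mean of homozygotes = {arithmetic_mid}")
print(f"geometric mean of homozygotes  = {geometric_mid:.2f}")
```

The heterozygote value of 7 coincides with the arithmetic mean of the homozygotes but exceeds their geometric mean, and it is this small dominance on the geometric scale that makes the product largest in the heterozygote.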

Gametic phase disequilibrium of linked loci can generate pseudo-overdominance 
in a similar way. If two loci are closely linked so that they appear to be one, and 
if the favourable alleles are dominant and linked in repulsion, then the heterozygote 
may be superior to either homozygote. The possibility of pseudo-overdominance 
being caused by linkage makes it extremely difficult to establish real overdominance 
at single loci from observations on populations derived from crosses between
different strains, because it is formally impossible to exclude the presence of a closely
linked but unrecognized locus. Wild populations of many Drosophila species have 
chromosomes with different gene arrangements carried in inverted segments.
Inversion heterozygotes are generally superior in fitness to homozygotes (see Wallace,
1968), and this heterozygote superiority is probably due to the linkage of the genes
in the inversions.

Finally, overdominance can arise at the molecular level. If a locus codes for an
enzyme, the products of the two alleles (allozymes) are likely to have different
properties, such as enzymatic activity, heat-stability, or optima for environmental
factors such as temperature or pH. The mixture of allozymes may therefore make the
heterozygote more versatile than either of the homozygotes with single allozymes,
i.e., less susceptible to the impairment of enzyme function by environmental
circumstances. Or, if the allozymes differ in activity, the intermediate activity of the
heterozygote may be more favourable than the higher or lower activities of the
homozygotes. For further details and discussion of the evidence for overdominance
at the molecular level, see Berger (1976).

We have seen that there are many ways by which overdominance for fitness could 
arise. It must be admitted, however, that the cases where it has been proved to occur 
are very few indeed. 

Polymorphism 

We saw earlier that the balance between mutation and selection satisfactorily accounts 
for the presence of deleterious genes at low frequencies, causing the appearance 
of rare abnormal, or mutant, individuals. Genes of this sort, however, are only a 
minor part of the genetic variation found in natural populations. There are many 
genes causing variants that are neither rare nor in any way abnormal, and the presence 
of these genes cannot easily be accounted for by the simple balance of selection against 
mutation. The blood-group genes used as examples in the first chapter are of this 
sort. More striking examples are the colour varieties found in many species,
particularly among insects, snails, and fish. The existence of these visible differences
caused by genes at intermediate frequencies is called polymorphism, and the term
is extended to cover all such variants, whether readily discernible or not. The term
is also used to describe loci at which there are variant alleles at intermediate
frequencies. Electrophoresis and other methods for detecting differences in the
amino-acid composition of proteins have shown that very many loci coding for proteins
are polymorphic, at least one-third of loci and perhaps much more in most organisms.
It is clear therefore that many loci, perhaps the majority, carry allelic differences
causing genetic variation between normal individuals. The ‘intermediate’ gene
frequencies by which polymorphic loci are defined are arbitrary, but are usually taken
to be in the range of 0.01 to 0.99 or, more strictly, the frequency of the commonest 
allele is taken to be not more than 0.99. The essential point is that the rarer alleles 
are at frequencies too high to be regarded as equilibrium frequencies for mutation 
balanced by selection, unless the selection is extremely weak. We therefore need 
another explanation for the widespread existence of polymorphic variation. There 
are two classes of explanation, the ‘selectionist’ and the ‘neutralist’. The selectionist 
view sees polymorphisms as balanced, i.e. as stable equilibria maintained by
selective forces. The neutralist view is that many polymorphisms have no functional
significance, but result from the survival of mutant alleles by random changes of 
gene frequency in populations of finite size, a subject to be dealt with in the next 
three chapters. 

Which of these two types of explanation is thought to be the more likely will depend
on the nature of the loci. At one extreme are the colour varieties found in many
species. A functional significance of these colours can hardly be denied and these
polymorphisms seem to require a selectionist explanation. At the other extreme are
the differences of nucleotide sequences in DNA found as restriction fragment length
polymorphisms (RFLPs) or by other means. For many of these a neutralist
explanation seems to be required, especially for those occurring in non-coding regions and
those coding for synonymous base changes. Between these two extremes are the
histocompatibility and the blood-group genes, which might be expected to be
subject to selection because of their associations with diseases; and the enzyme
polymorphisms which, depending on how the amino-acid substitutions affect the function,
might or might not be subject to selection.

Polymorphic loci give rise to the variation of quantitative characters, which is 
the main subject of this book. The possible causes of polymorphism, of which a 
brief summary follows, are therefore of interest to later chapters. 

1. Heterozygote advantage. We saw in the previous section that overdominance 
for fitness maintains an equilibrium gene frequency at intermediate levels, and that 
there are many ways in which overdominance for fitness can arise. Heterozygote 
advantage is therefore an attractive explanation for polymorphism. The paucity of 
proven cases need not be a difficulty because an advantage of the heterozygote so 
small as to be quite undetectable in practice would be enough to maintain a
polymorphism. There are, however, difficulties connected with the effects of inbreeding,
and with the number of loci that could be kept in a polymorphic state by heterozygote 
advantage. For further discussion of the problem, see Lewontin (1974), Berger 
(1976), Wills (1978) and Kimura (1983). 

2. Frequency-dependent selection. Having a phenotype that is rare may itself be 
an advantage, irrespective of what the phenotype is. The direction of selection is 
then dependent on the gene frequency: an allele at low frequency produces the rare 
phenotype and is favoured, but the same allele at a high frequency is selected against. 
This leads to a stable equilibrium gene frequency and so to a balanced polymorphism. 
Many examples of frequency-dependent selection are known. Pollen grains bearing 
a rare self-sterility allele have a better chance of fertilizing an ovule because the 
same allele is seldom present in the stigmata of other plants. Birds and fish have 
been shown to take disproportionately more of the more common type of food when 
they are offered a choice (see, e.g., Allen, 1975), and this is thought to exert 
frequency-dependent selection on polymorphic prey, such as snails, giving an
advantage to individuals with a rare pattern of coloration (see Clarke, 1969). In general,
the development of special methods of attack and of defence in the relationships
between predator and prey and between pathogen and host seems likely to result
in frequency-dependent selection. Frequency-dependent selection is reviewed by
Ayala and Campbell (1974) and by Clarke (1979). 

3. Heterogeneous environment. The environment experienced by individuals of 
a population is not constant; it differs from place to place and varies with time. If 
one allele is advantageous in one environment and another in a different
environment, stable polymorphism can result without heterozygotes necessarily being on
average superior. Selection can be thought of as tending to adapt different individuals
to different environments. The situation is complex because the outcome depends
on many factors such as the dominance relations, whether individuals choose to breed
in the environments to which they are adapted, whether mating is preferentially
between individuals from the same environment or random, what proportions of the
whole population inhabit each of the different environments, whether individuals
each encounter only one environment (‘coarse-grained’ environment) or more than
one (‘fine-grained’), and whether the heterogeneity of the environment is spatial
or temporal. A relatively simple form of selection in a heterogeneous environment
results in a cline. This is a gradient of gene frequency between one locality and
another, one allele being at a high frequency at one end of the cline and at a low
frequency at the other end. Clines are thought, or in many cases known, to be
maintained by selection favouring one allele in one locality and another allele in another
locality, with a limited amount of migration which allows mating only between
individuals from neighbouring parts of the cline. The selection in opposite directions
at the two ends and the ‘gene-flow’ up and down the cline maintain the
polymorphism. The role of heterogeneous environments in maintaining polymorphism
is reviewed by Felsenstein (1976). Evidence that it has an important role was
obtained from a survey of the polymorphism and ecology of 243 species (Nevo, 1978).
Theoretical considerations, however, show that the conditions under which
polymorphism would be maintained are rather restricted; see Hoekstra, Bijlsma, and Dolman
(1985). 

4. Transition. Polymorphisms seen at present might possibly be transitional stages 
in the evolutionary replacement of one allele by another, which has become more 
advantageous through some environmental change in the past. This, however, is 
unlikely to be the explanation of more than a very small proportion of polymorphism. 

5. Neutral mutation. All the above mechanisms involve selection as the force 
responsible for the polymorphism. It is possible, however, that the selection
coefficients may be very small indeed, so small that mutation rates become a significant
factor. The mutations that might give rise to the protein polymorphisms must be
regarded as unique events, because the variant forms of the protein differ at only
a few amino-acid sites, and the probability of getting the same amino-acid
substitution by recurrent mutation is very small indeed. When each mutation is treated as
a unique event, taking the population size into account shows that mutation and chance 
can give rise to polymorphisms. When the population is not infinite in size a unique 
mutation has a small, but not zero, chance of not only surviving but of eventually 
spreading through the whole population. The smaller the population, the greater is 
the mutant’s chance of survival. The vast majority of new mutants will be lost, but 
a few will survive and replace the original allele. (The chance of survival will be 
considered further in Chapter 4.) Those few mutants that do survive take a very 
long time to spread through the population, and during this time they contribute to 
the polymorphism. Since the changes of gene frequency are dependent on chance, 
the spread of a mutant through the population is erratic, its frequency sometimes 
increasing and sometimes decreasing. Furthermore, some mutants will survive and 
increase in frequency at first, only to disappear later. If a new mutant has a selective 
advantage its chance of survival is, of course, increased. If it has a selective
disadvantage its chance is decreased, but it may nevertheless survive if the disadvantage
is small and the population size is small. According to the neutral-mutation theory
some of the polymorphisms present at any time represent genes, nearly neutral with 
respect to fitness, that have mutated in the distant past and are still present in the 
population. 
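
The claim that a unique mutant has a small but not zero chance of spreading through a finite population can be illustrated numerically. The sketch below is not taken from the text. It assumes the simplest sampling model, in which the 2N genes of each generation are a random (binomial) sample of the genes of the previous generation, with no selection, and it follows the exact probability of every possible gene count. The function names and the figure of 2,000 generations are illustrative choices. Under these assumptions the chance of eventual fixation of a neutral allele comes out equal to its initial frequency, that is 1/2N for a single new mutant, and so it is greater in smaller populations.

```python
from math import comb


def transition_matrix(n_genes):
    """Row i holds P(j copies next generation | i copies now) under binomial sampling."""
    rows = []
    for i in range(n_genes + 1):
        q = i / n_genes
        rows.append([comb(n_genes, j) * q**j * (1 - q) ** (n_genes - j)
                     for j in range(n_genes + 1)])
    return rows


def fixation_probability(pop_size, copies=1, generations=2000):
    """Probability that a neutral allele starting with `copies` copies is fixed.

    Assumption: `generations` is long enough for nearly every line to have
    reached a frequency of 0 or 1 (true for the small sizes used here).
    """
    n_genes = 2 * pop_size
    matrix = transition_matrix(n_genes)
    dist = [0.0] * (n_genes + 1)
    dist[copies] = 1.0  # all of the probability starts at the initial count
    for _ in range(generations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n_genes + 1))
                for j in range(n_genes + 1)]
    return dist[n_genes]


if __name__ == "__main__":
    for n in (5, 10):
        print(n, fixation_probability(n), 1 / (2 * n))
```

The comparison printed at the end sets the computed chance beside 1/2N for two hypothetical population sizes.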

The neutral theory was proposed by Kimura. A comprehensive account of it is 
given by Kimura (1983) and its salient features are explained by Crow (1986; see 
sections 2.4 and 7.1). It met at first with a good deal of scepticism because it was 
hard to believe that the differences between the alleles of a protein polymorphism 
could really be neutral with respect to fitness. Strict neutrality, however, is not required
by the theory, and mutants with a small selective advantage or disadvantage
can be effectively neutral in populations that are not very large. Moreover, it has 
been shown that neutrality is to be expected as a consequence of evolution when 
natural selection favours an increased output from a metabolic pathway (Hartl, 
Dykhuizen, and Dean, 1985). There is now much evidence in support of the neutral 
theory, and neutral mutation is widely recognized as a major cause of polymorphism. 

Heterozygosity. A measure of the amount of polymorphism is needed for the comparison
of different populations and species, and for assessing the genetic variability
of a population. One measure is the proportion of loci that are polymorphic among 
the loci tested. This measure, however, is unsatisfactory because it gives undue weight 
to rare alleles, which contribute little to the genetic variability. Furthermore, the 
decision whether to count loci with one common and one rare allele as polymorphic 
is an arbitrary one based on the allele frequencies. Also, the number of such loci 
that are found depends on the sample size; more rare alleles will be found in larger 
samples. A better measure is the average heterozygosity, which is the frequency of
heterozygotes averaged over the loci tested. This is equivalent to the proportion of
loci at which the ‘average individual’ is heterozygous. Rare alleles contribute little
to the heterozygosity, as can be seen from Fig. 1.1. Heterozygosity can be expressed
as an observed value or as an expected value calculated from the observed allele
frequencies. The two differ if there is non-random mating in the population sampled,
for reasons to be explained in the next chapter. Average heterozygosities differ
between species and taxonomic groups. Some values quoted by Crow (1986, p. 15) 
range from 4 per cent in mammals to 15 per cent in molluscs. Enzyme polymorphisms 
differ also according to the function of the enzyme. In a survey of 9 Drosophila 
species, one functional group of 12 enzymes had an average heterozygosity of 10 
per cent, whereas another group of 13 enzymes had an average of 23 per cent (Latter, 
1981). This survey, incidentally, led to the conclusion that the data fitted the neutral 
theory best. In general, the heterozygosity observed fits well with the expectations 
of the neutral theory (see Kimura, 1983, p. 320). 
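
The two measures of polymorphism just described can be compared in a short calculation. The sketch below is an illustration only. It takes the expected heterozygosity at a locus to be one minus the sum of the squared allele frequencies (the Hardy-Weinberg expectation), and it uses a cut-off of 0.99 for the commonest allele as the arbitrary criterion of polymorphism. The allele frequencies are invented for the example and are not data from any survey.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in allele_freqs)


def average_heterozygosity(loci):
    """Mean of the expected heterozygosity over all loci tested."""
    return sum(expected_heterozygosity(freqs) for freqs in loci) / len(loci)


def proportion_polymorphic(loci, threshold=0.99):
    """Proportion of loci whose commonest allele is below `threshold` (an arbitrary cut-off)."""
    return sum(1 for freqs in loci if max(freqs) < threshold) / len(loci)


if __name__ == "__main__":
    # Hypothetical allele frequencies at three loci (invented for illustration).
    loci = [[0.5, 0.5], [0.98, 0.02], [1.0]]
    print("proportion polymorphic:", proportion_polymorphic(loci))
    print("average heterozygosity:", average_heterozygosity(loci))
```

In this invented example the locus with a rare allele counts fully toward the proportion of polymorphic loci but adds very little to the average heterozygosity.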

Problems 

2.1 Rare white-flowered plants occur in populations of a Delphinium species which 
normally has deep blue flowers. In an area in the Rocky Mountains the frequency 
of white-flowered plants was $7.4 \times 10^{-4}$. White-flowered plants were found to set
an average of 143 seeds per plant while blue-flowered plants set 229, the reduction
in seed-production being due to discrimination by pollinators, which are bumble-bees
and humming-birds. On the assumption that white flowers are due to a single
recessive gene, and that the population was in equilibrium, what rate of mutation 
would be needed to balance the selection? 

Data from Waser, N. M. & Price, M. V. (1981) Evolution, 35, 376-90. 

[Solution 12] 

2.2 If the white flowers in Problem 2.1 were due to a completely dominant gene, 
what would be the mutation rate needed to maintain equilibrium? 


[Solution 22] 

2.3 If an allele, A, mutates to a with a frequency of 1 in 10,000 and back-mutates 
with a frequency of 1 in 100,000, and if the three genotypes have equal fitnesses, 
what will be the genotype frequencies at equilibrium in a random-mating population? 

[Solution 32] 


2.4 Refer to Problem 2.3. What would be the consequences of doubling the mutation 
rate in both directions? 

[Solution 42] 

2.5 Medical treatment is, or will be, available for several serious autosomal 
recessive diseases. What would be the long-term consequences if treatment allowed 
sufferers from such a disease to have on average half the number of children that 
normal people have, whereas without treatment they would have no children? Assume 
that the present frequency is the mutation versus selection equilibrium, that in the 
long term a new equilibrium will be reached, and that no other circumstances change. 

[Solution 52] 

2.6 Cystic fibrosis is an autosomal recessive human disease with an incidence of 
about 1 in 2,500 live births among Caucasians. What would be the consequence 
in the immediately following generation if the mutation rate were doubled? Assume 
that the present frequency is the mutation versus selection equilibrium, that back- 
mutation is negligible, and that affected individuals have no children. Express your 
result as a percentage increase of incidence and as the number of additional cases 
per million births. 


[Solution 62] 

2.7 A careless Drosophila stock-keeper allows a stock of a dominant autosomal 
mutant to be contaminated by wild-type flies. Originally all flies were homozygous 
for the mutant, but after 10 generations some wild-type flies were found in the stock. 
Precautions were then taken to prevent further contamination. Suppose that we make 
the following assumptions: (i) In every generation 1 per cent of flies were
contaminants, (ii) all contaminants were homozygous wild type, (iii) mutant and
wild-type flies have equal fitness. With these assumptions what would be (1) the
proportion of wild-type flies in the generations after the last contamination, and (2) the
proportion of heterozygotes among the flies with the mutant phenotype? 

[Solution 72] 

2.8 The two closely linked recessive genes of Drosophila described in Problem 
1.7 can be treated as alleles. Two populations were set up with initial gene frequencies
of so of 0.2 in one and 0.8 in the other. After 7 generations of random breeding
the gene frequency of so was close to 0.35 in both populations. What does this tell 
us about the selection operating? 

[Solution 82] 

2.9 The gene that makes wild rats resistant to the anticoagulant poison warfarin 
exhibits heterozygote advantage because rats homozygous for the resistance gene 
suffer from vitamin K deficiency. Heterozygotes are resistant to the poison and do 
not suffer from vitamin K deficiency. The proportion of resistant homozygotes that 
die from vitamin K deficiency was estimated to be 63 per cent. Susceptible 
homozygotes are not all killed when poison is applied to an area. A population under 
continuous treatment with poison came to equilibrium with the resistance gene at 
a frequency of 0.34. What percentage of all rats in this population will die in consequence
of the resistance gene and the poisoning?

Data from Greaves, J. H., et al. (1977) Genet. Res. 30, 257-63.

[Solution 92] 

2.10 Suppose that two mutant genes are used in a class experiment on selection 
in Drosophila. In both cases heterozygotes are distinguishable from homozygotes but 
the genes are recessive with respect to fitness. (These are not known genes.) With 
gene (a) mutant homozygotes of both sexes have their fertility reduced by 50 per 
cent relative to the other genotypes, but have unimpaired viability. With gene (b) 
mutant homozygotes are fully fertile but both sexes have their pre-adult mortality 
increased by 50 per cent relative to the other genotypes. In both cases a parental 
population is made up of 30 ♂♂ + 30 ♀♀ homozygous wild type and 20 ♂♂ +
20 ♀♀ homozygous mutant. What genotype frequencies will be found in the
progeny? How do they compare with Hardy-Weinberg expectations based on
the observed gene frequency in the progeny? What conclusions about the selection
can be drawn from the frequencies in the progeny? Why does Δq differ in the two
cases? 


[Solution 102] 

2.11 Suppose a sex-linked trait due to a recessive gene has its genotypes in Hardy-Weinberg
frequencies. A breeder then culls all affected individuals of both sexes.
Derive an expression, in terms of the initial gene frequency, for the change of gene 
frequency resulting from one generation of selection. 


[Solution 112]

2.12 What is the approximate equilibrium gene frequency of a deleterious sex-linked
recessive gene, when selection is balanced by a mutation rate of u? Human X-linked
muscular dystrophy was found in a survey in England to have an incidence of 32.6
per 100,000 males. The mutation rate was estimated from the number of ‘sporadic’
cases to be $10.5 \times 10^{-5}$. Do these estimates agree with the expectation for a
population in equilibrium when sufferers from the disease do not reproduce and
carriers have normal survival and fertility?

Data from Gardner-Medwin, D. (1970) J. Med. Genet., 7, 334-7.

[Solution 122] 

2.13 Red coat colour in many breeds of cattle is due to an autosomal recessive 
gene, the dominant phenotype being black. Suppose that 1 per cent of red calves 
are born in a predominantly black breed, and suppose that it is desired to eliminate 
the red gene. Assuming the genotypes in the initial population to be in Hardy-Weinberg
proportions, what proportion of red calves would there be after applying
the following alternative selection procedures over two generations? (1) No red
animals are used for breeding. (2) In addition to culling all red animals, all black
bulls to be used for breeding are first tested by 6 progeny each from cows known
to be heterozygotes. Any bull producing one or more red calves in the test is discarded.
Cows used for breeding are not tested.


[Solution 132] 



3 


SMALL POPULATIONS: 

I. Changes of Gene Frequency under Simplified 
Conditions 


We have now to consider the last of the agencies through which gene frequencies 
can be changed. This is the dispersive process, which differs from the systematic 
processes in being random in direction, and predictable only in amount. In order 
to exclude this process from the previous discussions we have postulated always 
a ‘large’ population, and we have seen that in a large population the gene frequencies
are inherently stable. That is to say, in the absence of migration, mutation, or
selection, the gene and genotype frequencies remain unaltered from generation to
generation. This property of stability does not hold in a small population, and the
gene frequencies are subject to random fluctuations arising from the sampling of
gametes. The gametes that transmit genes to the next generation carry a sample of
the genes in the parent generation, and if the sample is not large the gene frequencies
are liable to change between one generation and the next. This random change
of gene frequency is the dispersive process. 

In this chapter and the next we shall be concerned with the effects of the dispersive 
process on gene frequencies. If the deductions to be made about gene frequencies 
seem to be rather remote from reality, it should be remembered that the properties 
of a population with respect to any genetically determined character depend on gene 
frequencies. The conclusions are therefore fully relevant to quantitative characters 
to be dealt with in later chapters. 

There are, broadly speaking, four consequences of the dispersive process, which 
are to be explained and quantified in this chapter. These are not really different consequences,
but rather different ways in which the consequences may be seen. They
are: 

1. Random drift. The random changes of gene frequency are called random drift. 
If the gene frequency in any one small population is followed, it may be seen to 
change in an erratic manner from generation to generation, with no tendency to revert 
to its original value. 

2. Differentiation between sub-populations. Random drift occurring independently 
in different sub-populations leads to genetic differentiation between the sub-populations.
The inhabitants of a large area seldom in nature constitute a single large
population, because mating takes place more often between inhabitants of the same
region. Natural populations are therefore more or less subdivided into local groups
or sub-populations, and these come to differ in gene frequencies if the number of
individuals in the group is small. Domesticated or laboratory populations, in the
same way, are often subdivided, for example into herds or strains, and in them
the subdivision and genetic differentiation are often more marked.

3. Uniformity within sub-populations. Genetic variation within each sub-population
becomes progressively reduced, and the individuals become more and more
alike in genotype. This genetic uniformity is the reason for the widespread use of 
inbred strains of laboratory animals in many areas of biological research. (An inbred 
strain is one maintained as a small population over many generations.) 

4. Increased homozygosity. Homozygotes increase in frequency at the expense 
of heterozygotes. This, coupled with the general tendency for deleterious alleles to 
be recessive, is the genetic basis for the loss of fertility and viability that almost 
always results from inbreeding. 

There are two different ways of looking at the dispersive process and of deducing 
its consequences. One is to regard it as a sampling process and to describe it in terms 
of sampling variance. The other is to regard it as an inbreeding process and describe 
it in terms of the genotypic changes resulting from matings between related individuals.
Of these, the first is probably the simpler for a description of how the
process works, but the second provides a more convenient means of quantifying 
its consequences. The plan to be followed here is first to describe the general nature 
of the dispersive process from the point of view of sampling. This will show how 
the four consequences come about. Then we shall approach the process afresh from 
the point of view of inbreeding, and show how the two viewpoints connect with 
each other. In all this we shall confine our attention to the simplest possible situation,
with migration, mutation, and selection excluded. Thus we shall see what happens
in small populations in the absence of other factors influencing gene frequency.
In the next chapter we shall extend the conclusions to more realistic situations by 
removing the restrictive simplifications, and in Chapter 5 we shall consider the special 
cases of pedigreed populations and very small populations maintained by regular 
systems of close inbreeding. 

The idealized population 

In order to reduce the dispersive process to its simplest form we imagine an idealized
population as follows. We suppose there to be initially one large population
in which mating is random, and this population becomes subdivided into a large 
number of sub-populations. The subdivision might arise from geographical or 
ecological causes under natural conditions, or from controlled breeding in 
domesticated or laboratory populations. The initial random-mating population will 
be referred to as the base population , and the sub-populations will be referred to 
as lines. All the lines together constitute the whole population, and each line is a 
‘small population’ in which gene frequencies are subject to the dispersive process.
When a single locus is under discussion we cannot properly understand what goes 
on in one line except by considering it as one of a large number of lines. But what 
happens to the genes at one locus in a number of lines happens equally to those at 
a number of loci in one line, provided they all start at the same gene frequency.
So the consequences of the process apply equally to a single line provided we consider
many loci in it.

Fig. 3.1. Diagrammatic representation of the subdivision of a single large population - the base
population - into a number of sub-populations, or lines.

The simplifying conditions specified for the idealized population are the following: 

1. Mating is restricted to members of the same line. The lines are thus isolated 
in the sense that no genes can pass from one line to another. In other words migration
is excluded.

2. The generations are distinct and do not overlap. 

3. The number of breeding individuals in each line is the same for all lines and 
in all generations. Breeding individuals are those that transmit genes to the next 
generation. 

4. Within each line mating is random, including self-fertilization in random amount. 

5. There is no selection at any stage. 

6. Mutation is disregarded. 

The situation implied by these conditions is represented diagrammatically in Fig. 
3.1, and may be described thus: All breeding individuals contribute equally to a 
pool of gametes from which zygotes will be formed. Union of gametes is strictly 
random. Out of a potentially large number of zygotes, only a limited number survive
to become breeding individuals in the next generation, and this is the stage at
which the sampling of the genes transmitted by the gametes takes place. Survival 
of zygotes is random, and consequently the contribution of the parents to the next 
generation is not uniform, but varies according to the chances of survival of their 
progeny. Since the population size is constant from generation to generation, the 
average number of progeny that reach breeding age is one per individual parent or 
two per mated pair of parents. In this scheme the sampling is seen as a single event, 
the reduction of a large number of gametes to a small number of breeding progeny. 
The reduction of numbers may take place in several stages. This makes no difference 
to the theoretical consequences deduced from the final number of breeding progeny, 
provided the sampling at each stage is random. The observed consequences, however, 
would be affected if a population were enumerated at stages before the reduction 
of numbers was complete. 
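
The sampling scheme just described can be imitated by drawing 2N genes at random in each generation. The following sketch is an illustration under the six simplifying conditions listed above and is not a procedure given in the text. The function names, the seed, and the example values (20 lines of 10 individuals followed for 50 generations) are assumptions made for the example.

```python
import random


def next_generation(q, pop_size, rng):
    """Sample 2N genes at random from the gamete pool and return the new gene frequency."""
    n_genes = 2 * pop_size
    count = sum(1 for _ in range(n_genes) if rng.random() < q)
    return count / n_genes


def simulate_line(q0, pop_size, generations, rng):
    """Gene frequency in one line, generation by generation, starting from q0."""
    freqs = [q0]
    for _ in range(generations):
        freqs.append(next_generation(freqs[-1], pop_size, rng))
    return freqs


def simulate_lines(q0, pop_size, generations, n_lines, seed=1):
    """Independent lines drawn from the same base population (no migration between them)."""
    rng = random.Random(seed)
    return [simulate_line(q0, pop_size, generations, rng) for _ in range(n_lines)]


if __name__ == "__main__":
    lines = simulate_lines(q0=0.5, pop_size=10, generations=50, n_lines=20)
    final = [line[-1] for line in lines]
    print("gene frequencies after 50 generations:", final)
    print("mean over lines:", sum(final) / len(final))
```

Each run of a line shows an erratic path of gene frequency, and a frequency of 0 or 1, once reached, is never left, because no gene of the other kind remains to be sampled.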

The following symbols will be used in connection with the idealized population. 

N = the number of breeding individuals in each line and generation. This is the
population size.

t = time, in generations, starting from the base population at $t_0$.

q = frequency of a particular allele at a locus.

p = 1 - q = frequency of all other alleles at that locus. q and p refer to the
frequencies in any one line; $\bar{q}$ and $\bar{p}$ refer to the frequencies in the whole
population and are the means of q and p; $q_0$ and $p_0$ are the frequencies in
the base population.

Since all systematic processes tending to change the gene frequency have been
excluded, the mean gene frequency among all lines at any stage must be the same
as the initial frequency. Thus $\bar{q} = q_0$, and the two can be used interchangeably in
this chapter.

It is obvious that the conditions specified for the idealized population do not hold 
in real populations. The conclusions to be drawn in this chapter, however, can be 
made applicable to real populations by the simple device of replacing the population 
size N by the ‘effective’ population size $N_e$, a concept to be introduced in the next
chapter. 


Sampling 

Variance of gene frequency 

The change of gene frequency resulting from sampling is random in the sense that 
its direction is unpredictable. But its magnitude can be predicted in terms of the 
variance of the change. Consider the formation of the lines from the base population.
Each line is formed from a sample of N individuals drawn from the base population.
Since each individual carries two genes at a locus, the subdivision of the
population represents a series of samples each of 2N genes, drawn at random from
the base population. The gene frequencies in these samples will have an average
value equal to that in the base population, i.e. $q_0$, and will be distributed about this
mean with a variance $p_0 q_0 / 2N$, which is the binomial variance of sample means,
the sample size being in this case 2N. This variance is the variance of $q_1$, the gene
frequency in the different lines after one generation. Since the initial gene frequency
$q_0$ is the same for all lines, it is also the variance of $(q_1 - q_0)$, which is the change
of gene frequency. Thus the change of gene frequency, $\Delta q$, resulting from sampling
in one generation, can be stated in terms of its variance as

$$\sigma^2_{\Delta q} = \frac{p_0 q_0}{2N}$$   ... [3.1]


This variance of $\Delta q$ expresses the magnitude of the change of gene frequency resulting
from the dispersive process. It expresses the expected change in any one line, or 
the variance of gene frequencies that would be found among many lines after one 
generation. Its effect is a dispersion of gene frequencies among the lines; in other 
words, the lines come to differ in gene frequency, though the mean in the popula¬ 
tion as a whole remains unchanged. 
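
Equation [3.1] can be checked by adding up the binomial probabilities of every possible number of genes in a sample of 2N. The sketch below is offered only as a numerical check. The function name and the example values ($q_0 = 0.3$, with N = 10 and N = 100) are chosen for illustration.

```python
from math import comb


def sampling_mean_and_variance(q0, pop_size):
    """Exact mean and variance of the gene frequency in a random sample of 2N genes."""
    n_genes = 2 * pop_size
    mean = 0.0
    mean_sq = 0.0
    for k in range(n_genes + 1):
        # Probability that the sample contains exactly k copies of the allele.
        prob = comb(n_genes, k) * q0**k * (1 - q0) ** (n_genes - k)
        q1 = k / n_genes
        mean += prob * q1
        mean_sq += prob * q1 * q1
    # Variance = mean of the squares minus the square of the mean.
    return mean, mean_sq - mean * mean


if __name__ == "__main__":
    q0 = 0.3
    for n in (10, 100):
        mean, var = sampling_mean_and_variance(q0, n)
        print(n, mean, var, q0 * (1 - q0) / (2 * n))
```

The mean stays at $q_0$ whatever the population size, while the variance falls in inverse proportion to N, so the dispersion after one generation is ten times smaller, in terms of variance, with N = 100 than with N = 10.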

In the next generation the sampling process is repeated, but each line now starts
from a different gene frequency and so the second sampling leads to a further dispersion.
The variance of the change now differs among the lines, since it depends on
the gene frequency $q_1$ in the first generation of each line separately. The effect of
continued sampling through successive generations is that each line fluctuates irregularly
in gene frequency, and the lines spread apart progressively, thus becoming
differentiated. These are the first two consequences of the process, and they are
exemplified in Fig. 3.2. If there were only one small population or line, one would
see only the random drift in the erratic changes of gene frequency from generation
to generation. Having several lines, as in Fig. 3.2, one sees random drift in each
line and also the progressive differentiation between them as they drift apart. The
differentiation between lines is more clearly seen in Fig. 3.3, from a different
experiment, showing the distributions of gene frequency in successive generations.

Fig. 3.2. Random drift of a colour gene (‘black’) in Tribolium. Heterozygotes were
recognizable, so the gene frequencies were estimated exactly by counting. The figure shows the
results with two population sizes, N = 10 and N = 100. There were 12 lines with each population
size. Natural selection favoured the wild-type allele and led to an overall increase in its
frequency, random drift causing variation of the lines around the mean, more marked in the
smaller than in the larger populations. (After Rich, Bell, and Wilson, 1979.)

Increasing differentiation among the lines is equivalent to increasing variance of
the gene frequency among them. The variance of the gene frequency, $\sigma^2_q$, among
the lines, at any generation t, is given by

$$\sigma^2_q = p_0 q_0 \left[ 1 - \left( 1 - \frac{1}{2N} \right)^t \right]$$   ... [3.2]


(The derivation of this expression will be explained later because it is more easily
understood by consideration of the inbreeding aspect of the process.) Figure 3.4
shows the variance of gene frequencies observed in the experiment of Fig. 3.3, with
the expected variance calculated from equation [3.2]. We may note here a fact that
will be needed later, and is obvious from equation [3.2], namely that $\sigma^2_p = \sigma^2_q$.

Fig. 3.3. Distributions of gene frequencies in 19 consecutive generations among 105 lines of
Drosophila melanogaster, each of 16 individuals. The gene frequencies refer to two alleles at the
‘brown’ locus ($bw^{75}$ and bw), with initial frequencies of 0.5. The height of each black column
shows the number of lines having the gene frequency shown on the scale below (number of
$bw^{75}$ genes), previously fixed lines being excluded. (After Buri, 1956.)
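
The growth of the variance among lines can also be followed generation by generation and compared with equation [3.2]. The sketch below is an illustrative check and not the derivation promised in the text. It assumes that the variance added in each generation is the sampling variance of equation [3.1] averaged over lines, which amounts to $(p_0 q_0 - \sigma^2_q)/2N$ because the mean of $q(1 - q)$ over lines is $p_0 q_0 - \sigma^2_q$. The function names are hypothetical, and the values N = 11.5 and $q_0 = 0.5$ are those quoted for Fig. 3.4.

```python
def variance_among_lines(q0, pop_size, t):
    """Equation [3.2]: variance of gene frequency among lines at generation t."""
    p0 = 1.0 - q0
    return p0 * q0 * (1.0 - (1.0 - 1.0 / (2.0 * pop_size)) ** t)


def variance_by_accumulation(q0, pop_size, t):
    """Build up the variance one generation at a time.

    Assumption: the variance added per generation is the average over lines
    of q(1 - q)/2N, which equals (p0*q0 - current variance)/2N.
    """
    p0 = 1.0 - q0
    var = 0.0
    for _ in range(t):
        var += (p0 * q0 - var) / (2.0 * pop_size)
    return var


if __name__ == "__main__":
    for t in (1, 5, 10, 19):
        print(t, variance_among_lines(0.5, 11.5, t), variance_by_accumulation(0.5, 11.5, t))
```

The two calculations agree at every generation, and both approach $p_0 q_0$ as t becomes large.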

Examination of the distributions of gene frequencies in Fig. 3.3 shows that the
distributions change in shape, becoming eventually quite flat, with all frequencies
equally probable. (This is not true of the limiting frequencies of 0 and 1, which
are discussed in the next section.) Theoretical considerations show that there are
two phases in the dispersion. During the initial phase the gene frequencies are
spreading out from the initial value. This phase is followed by a steady phase, when
the gene frequencies are evenly spread out over the range and all gene frequencies
except the two limits are equally probable. This uniform distribution of the steady
phase is attained even if the initial gene frequency is not 0.5, though it takes longer
to reach it. The duration, in generations, of the initial phase is a small multiple of
the population size, depending on the initial gene frequency. With $q_0 = 0.5$ it lasts
about 2N generations, and with $q_0 = 0.1$ it lasts about 4N generations (Kimura,
1955). The theoretical distributions of gene frequency during the initial phase, with
$q_0 = 0.5$ and $q_0 = 0.1$, are shown in Fig. 3.5. The observed distributions in Fig.
3.3 agree well with the theoretical distributions for $q_0 = 0.5$.

Fig. 3.4. Variance of gene frequencies among lines in the experiment illustrated in Fig. 3.3. The
circles are the observed values, and the smooth curve shows the expected variance as given by
equation [3.2]. The value taken for N is 11.5, which is the ‘effective number’, $N_e$, as explained in
the next chapter. (Data from Buri, 1956.)

Fixation 

There are limits to the spreading apart of the lines that can be brought about by 
the dispersive process. The gene frequency cannot change beyond the limits of 0 
or 1, and sooner or later each line must reach one or other of these limits. Moreover, 
the limits are ‘traps’ or points of no return, because once the gene frequency has 
reached 0 or 1 it cannot change any more in that line. When a particular allele has 
reached a frequency of 1 it is said to be fixed in that line, and when it reaches a 
frequency of 0 it is lost. When an allele reaches fixation, no other allele can be present
in that line, and the line may then be said to be fixed. When a line is fixed,
all individuals in it are of identical genotype with respect to that locus. Eventually
all lines, and all loci in a line, become fixed. The individuals of a line are then
genetically identical, and this is the basis of the genetic uniformity of highly inbred
strains.

Fig. 3.5. Theoretical distributions of gene frequencies among lines, after different numbers of
generations, t, expressed in terms of the population size of the lines, N. In the left-hand figure
$q_0 = 0.5$; on the right $q_0 = 0.1$. Previously fixed lines (see next section) are excluded. The
horizontal scale is the gene frequency, q, in any line. The vertical axis is the probability, scaled to
make the area under each curve equal to the proportion of unfixed lines. (After Kimura, 1955.)

When the process has gone to completion and all lines are fixed, the mean gene 
frequency is still unchanged and equal to the initial gene frequency. Therefore the 
proportion of the lines in which different alleles at a locus are fixed is equal to the 
initial frequencies of the alleles. If the base population contains two alleles $A_1$ and
$A_2$ at frequencies $p_0$ and $q_0$ respectively, then $A_1$ will be fixed in the proportion
$p_0$ of the lines, and $A_2$ in the remaining proportion, $q_0$. The variance of the gene
frequency among the lines is then $p_0 q_0$, as may be seen from equation [3.2] by
putting t equal to infinity. (In Fig. 3.3 the lines in which fixation or loss has just 
occurred are shown, but not those in which it occurred earlier.) 

When concerned with the attainment of genetic uniformity one wants to know how 
soon fixation takes place; what is the probability of a particular locus being fixed, 
or what proportion of all loci in a line will be fixed, after a certain number of generations.
Consideration of the progressive nature of the dispersion, as illustrated in Fig.
3.3, will show that fixation does not start immediately; the dispersion of gene frequencies
must proceed some way before any line is likely to reach fixation. To deduce
the probability of fixation is mathematically complicated and only an outline of the
conclusions can be given here. By the time the uniform distribution of the steady
phase has been reached, a locus with an initial gene frequency of $q_0 = 0.5$ will have
been fixed in about 50 per cent of lines, while a locus starting at $q_0 = 0.1$ will have
been fixed in over 90 per cent of lines. After the steady phase has been reached fixation
proceeds at a constant rate: a proportion 1/2N of the lines previously unfixed
becomes fixed in each generation. After the earliest stages of fixation, the proportion
of lines in which a gene with initial frequency $q_0$ is expected to be fixed, lost, or
to be still segregating is approximately as follows (Wright, 1952):

fixed:    $q_0 - 3 p_0 q_0 P$
lost:     $p_0 - 3 p_0 q_0 P$
neither:  $6 p_0 q_0 P$

where $P = \left( 1 - \frac{1}{2N} \right)^t$   ... [3.3]

Figure 3.6 shows the progress of fixation and loss in an experiment with Drosophila. 
The expectation calculated from equation [3.3] fits the data very well after about 
generation 8, when 5 per cent of lines had been fixed. 
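
The expectations of equation [3.3] are easily tabulated. The sketch below is a plain transcription of the three expressions, with a hypothetical function name. The values N = 9 and $q_0 = 0.5$ are those quoted for the experiment of Fig. 3.6. Since the expressions are stated to hold only after the earliest stages of fixation, the values for the first few generations should not be relied on, and at t = 0 the expressions for fixation and loss are in fact negative.

```python
def fixation_state(q0, pop_size, t):
    """Approximate proportions of lines fixed, lost, or still segregating (equation [3.3])."""
    p0 = 1.0 - q0
    big_p = (1.0 - 1.0 / (2.0 * pop_size)) ** t
    return {
        "fixed": q0 - 3.0 * p0 * q0 * big_p,
        "lost": p0 - 3.0 * p0 * q0 * big_p,
        "neither": 6.0 * p0 * q0 * big_p,
    }


if __name__ == "__main__":
    for t in (8, 12, 16, 19):
        state = fixation_state(q0=0.5, pop_size=9, t=t)
        print(t, state)
```

The three proportions add to one at every generation, and the proportion of unfixed lines falls by the fraction 1/2N from each generation to the next.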

Genotype frequencies 

Change of gene frequency leads to change of genotype frequencies; so the genotype
frequencies in small populations follow the changes of gene frequency resulting from
the dispersive process. In the idealized population, which we are still considering,
mating is random within each of the lines. Consequently the genotype frequencies
in any one line are the Hardy-Weinberg frequencies appropriate to the gene frequency
in the previous generation of that line. (There is, in fact, a small deviation
from Hardy-Weinberg frequencies within lines, which will be explained at the end
of this chapter, but it can be ignored for the moment.) As the lines drift apart in
gene frequency they become differentiated also in genotype frequencies. But differentiation
is not the only aspect of the change: the general direction of the change
is toward an increase of homozygous, and a decrease of heterozygous, genotypes.
The reason for this is the dispersion of gene frequencies from intermediate values
toward the extremes. Heterozygotes are most frequent at intermediate gene frequencies
(see Fig. 1.1), so the drift of gene frequencies toward the extremes leads, on
the average, to a decline in the frequency of heterozygotes. It also leads to a higher
proportion of the individuals in a line having the same genotype, and so to an increased
genetic uniformity within lines.

Fig. 3.6. Fixation and loss occurring among 107 lines of Drosophila melanogaster, during 19
generations. This is not the same experiment as that illustrated in Figs. 3.3 and 3.4, but was
similar in nature. There were 16 parents per generation in each line, and the effective number
(see Ch. 4) was 9. The closed circles show the percentage of lines in which the $bw^{75}$ allele has
become fixed; the open circles show the percentage in which it has been lost and the bw allele
fixed. The smooth curve is the expected amount of fixation of one or other allele, computed from
the effective number by equation [3.3]. (Data from Buri, 1956.)

The genotype frequencies in the population as a whole can be deduced from a
knowledge of the variance of gene frequencies in the following way. If an allele
has a frequency q in one particular line, homozygotes of that allele will have a
frequency of $q^2$ in that line. The frequency of these homozygotes in the population
as a whole will therefore be the mean value of $q^2$ over all lines. We shall write
this mean frequency of homozygotes as $\overline{(q^2)}$. The value of $\overline{(q^2)}$ can be found from
a knowledge of the variance of gene frequencies among the lines, by noting that
the variance of a set of observations is found by deducting the square of the mean
from the mean of the squared observations. Thus

    $\sigma_q^2 = \overline{(q^2)} - \bar{q}^2$

and

    $\overline{(q^2)} = \bar{q}^2 + \sigma_q^2$    ... [3.4]

where $\sigma_q^2$ is the variance of gene frequencies among the lines, as given in equation
[3.2], and $\bar{q}^2$ is the square of the mean gene frequency. Since the mean gene
frequency is equal to the original $q_0$, it follows that $\bar{q}^2$ or $q_0^2$ is the original frequency
of homozygotes in the base population. Thus in the population as a whole the
frequency of homozygotes of a particular allele increases, and is always in excess of
the original frequency by an amount equal to the variance of the gene frequency
among the lines. In a two-allele system the same applies to the other allele, and the
frequency of heterozygotes is reduced correspondingly. Noting from equation [3.2]
that $\sigma_p^2 = \sigma_q^2$, we therefore find the genotypic frequencies for a locus with two alleles
as follows:


Genotype      Frequency in whole population

$A_1A_1$      $p_0^2 + \sigma_q^2$
$A_1A_2$      $2p_0q_0 - 2\sigma_q^2$
$A_2A_2$      $q_0^2 + \sigma_q^2$

... [3.5]
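As a hedged numerical illustration of the table in [3.5], the following Python sketch computes the whole-population genotype frequencies from an initial gene frequency and a variance of gene frequency among lines. The function name and the example values are hypothetical, chosen only for illustration; they are not taken from the text.

```python
def genotype_frequencies_from_variance(p0, var_q):
    """Whole-population genotype frequencies, following the table in [3.5].

    p0    : initial (and mean) frequency of allele A1 (assumed example input)
    var_q : variance of gene frequency among the lines
    """
    q0 = 1.0 - p0
    # The variance cannot exceed p0*q0, the value reached when every line is fixed.
    if not 0.0 <= var_q <= p0 * q0:
        raise ValueError("var_q must lie between 0 and p0*q0")
    return {
        "A1A1": p0 ** 2 + var_q,
        "A1A2": 2 * p0 * q0 - 2 * var_q,
        "A2A2": q0 ** 2 + var_q,
    }


# Hypothetical example: p0 = 0.5 and a between-line variance of 0.05.
freqs = genotype_frequencies_from_variance(p0=0.5, var_q=0.05)
print(freqs)
```

The three frequencies always sum to 1, because the variance added to the two homozygotes is exactly what is removed from the heterozygotes.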


These genotype frequencies are no longer the Hardy-Weinberg frequencies
appropriate to the original or mean gene frequency. The Hardy-Weinberg relationships
between gene frequency and genotype frequencies, though they hold good within
each line separately, do not hold if the lines are taken together and regarded as a
single population. This fact causes some difficulty in relating gene and genotype
frequencies in natural populations, because they are often more or less subdivided
and the degree of subdivision is seldom known. An example of the decrease of
heterozygotes resulting from the dispersion of gene frequencies is shown in Fig. 3.7.

Fig. 3.7. Change of frequency of heterozygotes among 105 lines of Drosophila melanogaster,
each with 16 parents. The same experiment as is illustrated in Figs. 3.3 and 3.4. The frequency of
heterozygotes refers to the population as a whole, all lines taken together. The smooth curve is
the expected frequency of heterozygotes. (Data from Buri, 1956.)

The foregoing account of genotype frequencies describes the situation in terms 
of one locus in many lines. It can be regarded equally as referring to many loci 
in one line; then the change in any one line or small population is an increase in 
the number of loci at which individuals are homozygous and a corresponding decrease 
in the number at which they are heterozygous — in short, an increase of homozygotes 
at the expense of heterozygotes. This change of genotype frequencies resulting from 
the dispersive process is the genetic basis of the phenomenon of inbreeding depres¬ 
sion, of which a full explanation will be found in Chapter 14. 

We have now surveyed the general nature of the dispersive process and its four 
major consequences — random drift, differentiation of sub-populations, genetic 
uniformity within sub-populations, and overall increase in the frequency of 
homozygous genotypes. Let us now look at the process from another viewpoint, 
as an inbreeding process. Instead of regarding the increase of homozygotes as a con¬ 
sequence of the dispersion of gene frequencies, we shall now look directly at the 
manner in which the additional homozygotes arise. 

Inbreeding 

Inbreeding means the mating together of individuals that are related to each other 
by ancestry. That the degree of relationship between the individuals in a population 
depends on the size of the population will be clear by consideration of the numbers
of possible ancestors. In a population of bisexual organisms every individual has
two parents, four grandparents, eight great-grandparents, etc., and t generations
back it has $2^t$ ancestors. Not very many generations back, the number of individuals
required to provide separate ancestors for all the present individuals becomes larger 
than any real population could contain. Any pair of individuals must therefore be 
related to each other through one or more common ancestors in the more or less 
remote past; and the smaller the size of the population in previous generations the 
less remote are the common ancestors, or the greater their number. Thus pairs mating 
at random are more closely related to each other in a small population than in a 
large one. This is why the properties of small populations can be treated as the con¬ 
sequences of inbreeding. 

The essential consequence of two individuals having a common ancestor is that
they may both carry replicates of one of the genes present in the ancestor; and if
they mate they may pass on these replicates to their offspring. Thus inbred individuals
(that is to say, offspring produced by inbreeding) may carry two genes at a locus
that are replicates of one and the same gene in a previous generation. Consideration
of this consequence of inbreeding shows that there are two sorts of identity among 
allelic genes, and two sorts of homozygote. The sort of identity we have hitherto 
considered is a functional identity. If two genes cannot be distinguished by their 
phenotypic effects, or by any other functional criterion, they are regarded as being 
the same allele. An individual carrying a pair of such genes is a homozygote in the 
ordinary sense. The new sort of identity is one of origin by replication. Two genes 
that have originated from the replication of one single gene in a previous generation 
may be called identical by descent, or simply identical. Two genes that are not
identical are independent in descent. Homozygotes of identical genes may be called
identical homozygotes. Other terms in use are autozygous to describe identical
homozygotes and allozygous to describe homozygotes that are not known to be
autozygous. It is the production of identical homozygotes that gives rise to the
increase of homozygotes as a consequence of inbreeding.

Identity by descent provides the basis for a measure of the dispersive process, 
through the degree of relationship between the mating pairs. The measure is the 
coefficient of inbreeding, which is the probability that the two genes at any locus 
in an individual are identical by descent. It refers to an individual and expresses 
the degree of relationship between the individual’s parents. If the parents of any 
generation have mated at random then the coefficient of inbreeding of the progeny 
is the probability that two gametes taken at random from the parent generation carry 
identical genes at a locus. This is the average coefficient of inbreeding of all the 
progeny. Individuals of different families will have different inbreeding coefficients 
because with random mating some pairs of parents will be more closely related than 
other pairs. It is, however, with the average coefficient of inbreeding that we are 
concerned as a measure of the dispersive process. The coefficient of inbreeding is 
generally symbolized by F. 

The degree of relationship expressed in the inbreeding coefficient is essentially 
a comparison between the population in question and some specified or implied base 
population. Without this point of reference it is meaningless, as the following
consideration will show. On account of the limitation in the number of independent
ancestors in any population not infinitely large, all genes now present at a locus
in the population would be found to be identical by descent if traced far enough 
back into the remote past. Therefore the inbreeding coefficient only becomes mean¬ 
ingful if we specify some time in the past beyond which ancestries will not be pursued, 
and at which all genes present in the population are to be regarded as independent 
— that is, not identical by descent. This point is the base population and by its defini¬ 
tion it has an inbreeding coefficient of zero. The inbreeding coefficient of a subse¬ 
quent generation expresses the amount of the dispersive process that has taken place 
since the base population, and compares the degree of relationship between the in¬ 
dividuals now, with that between individuals in the base population. Reference to 
the base population is not always explicitly stated, but is always implied. For exam¬ 
ple, we can speak of the inbreeding coefficient of a population subdivided into lines. 
The comparison of relationship is between the individuals of a line and individuals 
taken at random from the whole population. The base population implied is a 
hypothetical population from which all the lines were derived. 


Inbreeding in the idealized population 

Let us now return to the idealized population and deduce the coefficient of inbreeding 
in successive generations, starting with the base population and its progeny con¬ 
stituting generation 1. The situation may be visualized by thinking of a hermaphrodite 
marine organism, capable of self-fertilization, shedding eggs and sperm into the sea. 
There are N individuals, each shedding equal numbers of gametes which unite at 
random. All the genes at a locus in the base population have to be regarded as being 
non-identical; so, considering only one locus, among the gametes shed by the base 
population there are 2N different sorts, in equal numbers, bearing the genes $A_1$,
$A_2$, $A_3$, etc., at the A locus. The gametes of any one sort carry identical genes;
those of a different sort carry genes of independent origin. What is the probability
that a pair of gametes taken at random carry identical genes? This is the inbreeding
coefficient of generation 1. Any gamete has a (1/2N)th chance of uniting with another
of the same sort, so 1/2N is the probability that uniting gametes carry identical genes,
and is thus the coefficient of inbreeding of the progeny. Now consider the second
generation. There are now two ways in which identical homozygotes can arise, one
from the new replication of genes and the other from the previous replication. The
probability of newly replicated genes coming together in a zygote is again 1/2N.
The remaining proportion, 1 - 1/2N, of zygotes carry genes that are independent
in their origin from generation 1, but may have been identical in their origin from
generation 0. The probability of their identical origin in generation 0 is what we
have already deduced as the inbreeding coefficient of generation 1. Thus the total
probability of identical homozygotes in generation 2 is


    $F_2 = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right) F_1$

where $F_1$ and $F_2$ stand for the inbreeding coefficients of generations 1 and 2
respectively. The same argument applies to subsequent generations, so that in general the
inbreeding coefficient of individuals in generation t is

    $F_t = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right) F_{t-1}$    ... [3.6]

Thus the inbreeding coefficient is made up of two parts: an ‘increment’, 1/2N,
attributable to the new inbreeding, and a ‘remainder’, attributable to the previous
inbreeding and having the inbreeding coefficient of the previous generation. In the
idealized population the ‘new inbreeding’ arises from self-fertilization, which brings
together genes replicated in the immediately preceding generation. Exclusion of
self-fertilization simply shifts the replication one generation further back, so that the ‘new
inbreeding’ brings together genes replicated in the grandparental generation; the
coefficient of inbreeding is affected, but not very much, as we shall see later. The
distinction between ‘new’ and ‘old’ inbreeding brings clearly to light a point which we
note here in passing because it will be needed later and is often important in
practice. If there is no new inbreeding, as would happen if the population size were
suddenly increased, the previous inbreeding is not undone, but remains where it
was before the increase of population size.
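The point about ‘new’ and ‘old’ inbreeding can be sketched numerically. The Python snippet below applies the recurrence of equation [3.6] one generation at a time, with the assumption (not stated in this passage) that the N in the increment is simply the size of the population in that generation. The function name and the population sizes are hypothetical.

```python
def inbreeding_by_generation(sizes):
    """Apply F_t = 1/(2N) + (1 - 1/(2N)) * F_(t-1) for each generation.

    sizes : population size N for generations 1, 2, ... (hypothetical input);
            the base population (generation 0) has F = 0 by definition.
    Returns the list [F_1, F_2, ...].
    """
    f_prev = 0.0
    history = []
    for n in sizes:
        new = 1.0 / (2 * n)                   # the 'new' inbreeding (increment)
        f_prev = new + (1.0 - new) * f_prev   # added to the 'old' inbreeding
        history.append(f_prev)
    return history


# Five generations at N = 10, then a sudden expansion to N = 1,000,000.
f = inbreeding_by_generation([10] * 5 + [1_000_000] * 5)
print([round(x, 4) for x in f])
```

After the expansion the coefficient stays almost exactly where it was: the increment becomes negligible, but nothing already accumulated is removed.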

Let us call the ‘increment’ or ‘new inbreeding’ $\Delta F$, so that

    $\Delta F = \frac{1}{2N}$    ... [3.7]

Equation [3.6] may then be rewritten in the form

    $F_t = \Delta F + (1 - \Delta F) F_{t-1}$    ... [3.8]

Further rearrangement makes clearer the precise meaning of the ‘increment’ $\Delta F$.

    $\Delta F = \frac{F_t - F_{t-1}}{1 - F_{t-1}}$    ... [3.9]


From the equation written thus we see that the ‘increment’ $\Delta F$ measures the rate
of inbreeding in the form of a proportionate increase. It is the increase of the
inbreeding coefficient in one generation, relative to the distance that was still to go
to reach complete inbreeding. This measure of the rate of inbreeding provides a
convenient way of going beyond the restrictive simplifications of the idealized
population, and it thus provides a means of comparing the inbreeding effects of different
breeding systems. When the inbreeding coefficient is expressed in terms of $\Delta F$,
equation [3.8] is valid for any breeding system and is not restricted to the idealized
population, though only in the idealized population is $\Delta F$ equal to 1/2N.

So far we have done no more than relate the inbreeding coefficient in one
generation to that of the previous generation. It remains to extend equation [3.8] back to
the base population and so express the inbreeding coefficient in terms of the number
of generations. This is made easier by the use of a symbol P for the complement
of the inbreeding coefficient, 1 - F, which is known as the panmictic index.
Substitution of (1 - P) for F in equation [3.9], and rearrangement, leads to

    $\frac{P_t}{P_{t-1}} = 1 - \Delta F$    ... [3.10]

Thus the panmictic index is reduced by a constant proportion in each generation.
Extension back to generation t - 2 gives

    $\frac{P_t}{P_{t-2}} = (1 - \Delta F)^2$

and extension back to the base population gives

    $P_t = (1 - \Delta F)^t P_0$    ... [3.11]

where $P_0$ is the panmictic index of the base population. The base population is
defined as having an inbreeding coefficient of 0, and therefore a panmictic index
of 1. The inbreeding coefficient in any generation t, referred to the base population,
is therefore

    $F_t = 1 - (1 - \Delta F)^t$    ... [3.12]
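The agreement between the step-by-step recurrence [3.8] and the closed form [3.12] can be checked numerically. The sketch below is only an illustration under the idealized-population assumption $\Delta F = 1/2N$; the function names and the value N = 16 are hypothetical choices.

```python
def f_closed_form(delta_f, t):
    """Inbreeding coefficient after t generations, equation [3.12]."""
    return 1.0 - (1.0 - delta_f) ** t


def f_recurrence(delta_f, t):
    """Same quantity obtained by iterating equation [3.8] from F_0 = 0."""
    f = 0.0
    for _ in range(t):
        f = delta_f + (1.0 - delta_f) * f
    return f


delta_f = 1.0 / (2 * 16)   # idealized population with N = 16 (example value)
for t in (1, 5, 10, 50):
    print(t, round(f_closed_form(delta_f, t), 4), round(f_recurrence(delta_f, t), 4))
```

Both columns agree to rounding error, which is what the derivation through the panmictic index leads one to expect.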

The consequences of the dispersive process were described earlier from the view¬ 
point of sampling variance. Let us now look again at them, applying the rate of 
inbreeding and the inbreeding coefficient as measures of the process. Strictly speaking 
we should refer still to the idealized population, but the equating of the two viewpoints 
is generally valid, unless the population size is different in parent and offspring genera¬ 
tions or there is non-random mating within lines. 

Variance of gene frequency 

First, the variance of the change of gene frequency in one generation, taken from
equation [3.1] and expressed in terms of the rate of inbreeding, becomes

    $\sigma_{\Delta q}^2 = \frac{p_0 q_0}{2N} = p_0 q_0 \Delta F$    ... [3.13]

An equivalent way of writing equation [3.13] is in terms of the inbreeding
coefficient and the variance of gene frequencies after one generation. It follows that the
relationship is the same after any number of generations, so that after t generations

    $\sigma_q^2 = p_0 q_0 F_t$    ... [3.14]

Equation [3.14] can be shown to be equivalent to equation [3.2], which was given
without explanation. Replacing $F_t$ in equation [3.14] by $[1 - (1 - 1/2N)^t]$ from
equations [3.12] and [3.7] gives equation [3.2].
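A short numerical sketch may make the relationship concrete. Assuming the idealized population, so that $F_t = 1 - (1 - 1/2N)^t$, the Python function below evaluates the variance of gene frequency among lines from equation [3.14]. The function name and the example inputs are hypothetical.

```python
def variance_of_gene_frequency(p0, n, t):
    """Variance of gene frequency among lines after t generations.

    Uses [3.14], sigma_q^2 = p0*q0*F_t, with F_t from [3.12] and
    delta F = 1/(2N) from [3.7] (idealized population assumed).
    """
    q0 = 1.0 - p0
    f_t = 1.0 - (1.0 - 1.0 / (2 * n)) ** t
    return p0 * q0 * f_t


# Hypothetical example: p0 = 0.5 and N = 16 parents per line.
for t in (1, 10, 100, 1000):
    print(t, round(variance_of_gene_frequency(0.5, 16, t), 4))
```

The variance starts at $p_0q_0/2N$ after one generation and approaches its limit $p_0q_0$ as F approaches 1, that is, as all lines become fixed.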

Measures of the dispersive process based on inbreeding are more useful than those
based on the variance of gene frequencies because they apply equally to any mean,
or initial, gene frequency. Thus $\Delta F$ expresses the rate of dispersion, and F the
cumulated effect of random drift.

Genotype frequencies 

Let us consider next the genotype frequencies in the population as a whole. The
genotype frequencies expressed in terms of the variance of gene frequency in
equation [3.5] can be rewritten in terms of the coefficient of inbreeding from equation
[3.14]. The frequency of $A_2A_2$, for example, is

    $\overline{(q^2)} = q_0^2 + \sigma_q^2 = q_0^2 + p_0 q_0 F$

The genotype frequencies expressed in this way are entered in the left-hand side
of Table 3.1. As was explained before, this way of writing the genotype frequencies
shows how the homozygotes increase at the expense of the heterozygotes.
Recognition of identity by descent to which the inbreeding viewpoint led us means that we
can now distinguish the two sorts of homozygote, identical and independent, among
the $A_1A_1$ and $A_2A_2$ genotypes. The frequency of identical homozygotes among both
genotypes together is by definition the inbreeding coefficient, F; and the division
between the two genotypes is in proportion to the initial gene frequencies. So $p_0F$
is the frequency of $A_1A_1$ identical homozygotes, and $q_0F$ that of $A_2A_2$ identical
homozygotes. The remaining genotypes, both homozygotes and heterozygotes, carry
genes that are independent in origin and are therefore the equivalent of pairs of
gametes taken at random from the population as a whole. Their frequencies are
therefore the Hardy-Weinberg frequencies. Thus, from the inbreeding viewpoint,
we arrive at the genotype frequencies shown in the right-hand columns of Table 3.1.
This way of writing the genotype frequencies shows how homozygotes are divided
between those of independent and those of identical origin. The equivalence of the
two ways of expressing the genotype frequencies can be verified from their algebraic
identity. Both ways show equally clearly how the heterozygotes are reduced in
frequency in proportion to 1 - F.
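The algebraic identity of the two ways of writing the genotype frequencies in Table 3.1 can also be checked numerically. The following Python sketch is only an illustration; the function name and the values of $p_0$ and F are hypothetical.

```python
def table_3_1(p0, f):
    """Return both forms of the genotype frequencies in Table 3.1."""
    q0 = 1.0 - p0
    left = {   # original frequency plus change due to inbreeding
        "A1A1": p0 ** 2 + p0 * q0 * f,
        "A1A2": 2 * p0 * q0 - 2 * p0 * q0 * f,
        "A2A2": q0 ** 2 + p0 * q0 * f,
    }
    right = {  # genes of independent origin plus genes of identical origin
        "A1A1": p0 ** 2 * (1 - f) + p0 * f,
        "A1A2": 2 * p0 * q0 * (1 - f),
        "A2A2": q0 ** 2 * (1 - f) + q0 * f,
    }
    return left, right


left, right = table_3_1(p0=0.3, f=0.25)   # hypothetical values
for genotype in left:
    print(genotype, round(left[genotype], 4), round(right[genotype], 4))
```

For each genotype the two columns agree, and in either form the three frequencies sum to 1.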

The panmictic index, which was defined earlier as P = 1 - F, expresses the
frequency of heterozygotes in a subdivided population relative to the Hardy-Weinberg
frequency expected if the population as a whole mated at random. This
can be seen by consideration of the frequency of heterozygotes given in Table 3.1.
Let $H_t$ and $H_0$ be the frequencies of heterozygotes in a subdivided and
random-mating population respectively. Then $H_0 = 2p_0q_0$ and $H_t = 2p_0q_0(1 - F)
= H_0(1 - F)$. The panmictic index at generation t is therefore

    $P_t = 1 - F_t = \frac{H_t}{H_0}$    ... [3.15]


Table 3.1 Genotype frequencies for a locus with two alleles, expressed in terms of
the inbreeding coefficient F.

              Original       Change due to            Origin:
Genotype      frequencies    inbreeding               Independent           Identical

$A_1A_1$      $p_0^2$        $+ p_0q_0F$        =     $p_0^2(1 - F)$        $+ p_0F$
$A_1A_2$      $2p_0q_0$      $- 2p_0q_0F$       =     $2p_0q_0(1 - F)$
$A_2A_2$      $q_0^2$        $+ p_0q_0F$        =     $q_0^2(1 - F)$        $+ q_0F$

When a real population is sampled, a deficiency of heterozygotes may be the only
indication that it is a subdivided population. The observed frequency of heterozygotes,
H, relative to the Hardy-Weinberg frequency, 2pq, then gives the panmictic index
as P = H/2pq, p and q being the observed gene frequencies in the sample as a whole.
Caution is needed, however, in regarding the value of P so calculated as anything
more than a description of the sample. It is unlikely that all the sub-populations that
really existed would be equally represented in the sample and, unless they were,
P could not validly be used to estimate, for example, the variance of the gene
frequency among the sub-populations by equation [3.14].
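The calculation of P = H/2pq from a pooled sample can be sketched as follows. The genotype counts are invented purely for illustration and the function name is hypothetical; as the text cautions, the result describes the sample only.

```python
def panmictic_index(n_aa, n_ab, n_bb):
    """P = H / (2pq) from genotype counts pooled over the whole sample."""
    total = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * total)   # observed frequency of allele a
    q = 1.0 - p
    h = n_ab / total                      # observed heterozygote frequency
    return h / (2 * p * q)                # undefined if the sample is monomorphic


# Hypothetical counts, invented purely for illustration.
P = panmictic_index(n_aa=30, n_ab=40, n_bb=30)
print("P =", round(P, 3), " F =", round(1 - P, 3))
```

A sample in exact Hardy-Weinberg proportions gives P = 1, and a deficiency of heterozygotes gives P below 1.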

Genotype frequencies within lines. Throughout this chapter it has been assumed
that mating is at random within lines, and that consequently the genotype
frequencies within any line are the Hardy-Weinberg frequencies appropriate to the gene
frequency in that line. It was pointed out, however, that the genotype frequencies
actually deviate slightly from the Hardy-Weinberg expectations. The reason for
this deviation is that the sample of genes passed to the next generation consists, in
fact, of two independent samples, one in male parents and the other in female parents,
with N genes in each sample. The male and female parents therefore differ, on
average, in their gene frequencies. A difference in gene frequency between male
and female parents leads to an excess of heterozygotes in the progeny; or, in other
words, an expectation calculated from the mean gene frequency is too low. By
putting appropriate gametic frequencies in Table 1.2 it can be shown that the expected
frequency of heterozygotes within any line is $H = 2\bar{p}\bar{q} + \frac{1}{2}\overline{D^2}$, where $\bar{p}$ and $\bar{q}$ are
the mean gene frequencies in the line, D is the difference in gene frequency
between male and female parents, and $\overline{D^2}$ is the mean squared difference. Since
$\bar{D} = 0$ it follows, by analogy with equation [3.4], that $\overline{D^2} = \sigma_D^2$. The variance $\sigma_D^2$
is the variance of the difference between two binomial samples of size N, which
is $2\bar{p}\bar{q}/N$. Thus the expected frequency of heterozygotes within lines is

    $H = 2\bar{p}\bar{q} + \bar{p}\bar{q}/N$

    $\phantom{H} = 2\bar{p}\bar{q}\left(1 + \frac{1}{2N}\right)$    ... [3.16]

(For further details see Robertson, 1965.) The excess of heterozygotes is trivial unless
N, the number of parents of the sample, is very small. But it can have an appreciable
effect if the frequencies observed in a small sample of a single population are tested
for agreement with Hardy-Weinberg expectations.
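To give a feel for how quickly the excess of heterozygotes becomes trivial, the sketch below evaluates H = 2pq(1 + 1/2N), the expression referred to as equation [3.16], for a few numbers of parents. The function name, the gene frequency of 0.5 and the values of N are hypothetical choices for illustration.

```python
def expected_heterozygotes_within_line(p, n_parents):
    """H = 2pq(1 + 1/(2N)), equation [3.16]; N is the number of parents."""
    q = 1.0 - p
    return 2 * p * q * (1.0 + 1.0 / (2 * n_parents))


# Relative excess over the Hardy-Weinberg value 2pq = 0.5 for a few line sizes.
for n in (2, 8, 50, 500):
    h = expected_heterozygotes_within_line(0.5, n)
    print(n, round(h, 4), "excess = {:.1%}".format(h / 0.5 - 1))
```

The relative excess is simply 1/2N, so it matters only when the progeny come from very few parents.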

The overall frequency of heterozygotes in the whole of a subdivided population 
is sometimes used to estimate the amount of inbreeding in the history of the popula¬ 
tion, rather than as a description of the present state of subdivision. This is done 
in Example 4.1. If the lines are separately identifiable, and the number of parents 
sampled in each line is known, correction can be made for the excess of heterozygotes. 
Substitution of H from equation [3.16] for $H_0$ in equation [3.15] gives

    $1 - F = \frac{H}{2pq(1 + 1/2N)}$    ... [3.17]

where H is the observed frequency of heterozygotes, p and q are the overall observed
gene frequencies, and N is the number of parents in each line.
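Equation [3.17] can be applied directly once H, the overall gene frequency and N are known. The following Python sketch shows the arithmetic; the function name and the input values are hypothetical and are not data from the text.

```python
def inbreeding_from_heterozygotes(h_obs, p, n_parents):
    """Estimate F from 1 - F = H / [2pq(1 + 1/(2N))], equation [3.17]."""
    q = 1.0 - p
    expected = 2 * p * q * (1.0 + 1.0 / (2 * n_parents))
    return 1.0 - h_obs / expected


# Hypothetical inputs: observed H = 0.30, overall p = 0.4, N = 8 parents per line.
print(round(inbreeding_from_heterozygotes(0.30, 0.4, 8), 4))
```

With a very large N the correction disappears and the estimate reduces to F = 1 - H/2pq, as in equation [3.15].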
Problems

In working the problems on Chapter 3, treat the populations as if they were idealized 
populations. 

3.1 Codfish have two forms of haemoglobin determined by alleles a and b at one 
locus. A sample of cod taken off the Norwegian coast had the following frequencies 
of the three genotypes. 

aa ab bb Total 

130 763 1698 2591 

Are these frequencies compatible with the sample having been drawn from a random¬ 
breeding population? What do they suggest about the breeding structure of the 
population? 

Data from Moller, D. (1968) Hereditas, 60, 1-32.


[Solution 3] 

3.2 Among the cod described in Problem 3.1 two distinct races can be recognized 
by anatomical differences in the otoliths. When the sample was separated into the 
two races, called ‘Arctic’ and ‘Coastal’, the following numbers were found. 



             aa     ab     bb    Total

Arctic       23    250    946     1219
Coastal     107    513    752     1372

What further light does this throw on the question in Problem 3.1?

[Solution 13]

3.3 If a population is maintained by random mating among 20 pairs of parents in 
every generation, what will be its inbreeding coefficient after 5 and after 10 
generations? 


[Solution 23] 

3.4 Suppose that for a class experiment each student was given 10 pairs of un¬ 
mated Drosophila taken at random from a large stock in which an electrophoretic 
variant was present at a gene frequency of 0.3. Each student then maintained his 
sub-population by taking 10 pairs at random to be parents of the next generation. 
After 5 generations each student determined the gene frequency in his own popula¬ 
tion by electrophoresis of a sample of 20 flies from the progeny. What would be 
the average gene frequency found? How much variation would you expect to find 
among the students in their estimated gene frequencies, assuming that all read their 
gels correctly? 


[Solution 33] 

3.5 If the numbers of the three genotypes counted by the students in the experiment
of Problem 3.4 were put together, what would be the overall frequencies of
the genotypes?

[Solution 43] 

3.6 A stock of mice consisted of 18 lines all derived from the same base popula¬ 
tion but bred separately thereafter. The stock was polymorphic for an autosomal 
enzyme locus, Got-1, with two alleles, a and b. After 27 generations mice from 
all the lines were typed by electrophoresis for the genotypes at this locus and the 
following numbers were found. 

aa     ab     bb     Total

42     76     448    566

What is the inbreeding coefficient indicated by these numbers?

Data from Garnett, I. (1973) Ph.D. Thesis, University of Edinburgh.

[Solution 53]

3.7 Suppose that a random-breeding population is sampled and the following 
genotype frequencies of a protein variant are found. 

aa ab bb 

0.34 0.52 0.14 

(1) Ignoring the question of significance, do these frequencies give evidence of some 
form of selection operating on the genotypes? (2) How would the conclusion be altered 
by the knowledge that the individuals in the sample were the progeny of 4 pairs 
of parents? 

[Solution 63] 

3.8 Modify equation [3.16] so as to be applicable when there are different numbers 
of male and female parents, as is usually the case with domestic livestock. 

[Solution 73] 



4 SMALL POPULATIONS:

II. Less Simplified Conditions


In order to simplify the description of the dispersive process we confined our atten¬ 
tion in the last chapter to an idealized population, and to do this we had to specify 
a number of restrictive conditions, which could seldom be fulfilled in real popula¬ 
tions. The purpose of this chapter is to adapt the conclusions of the last chapter to 
situations in which the conditions imposed do not hold; in other words, to remove 
the more serious restrictions and bring the conclusions closer to reality. The restric¬ 
tive conditions were of two sorts, one sort being concerned with the breeding struc¬ 
ture of the population and the other excluding mutation, migration, and selection 
from consideration. We shall first describe the effects of deviations from the idealized 
breeding structure, and then consider the outcome of the dispersive process when 
mutation, migration, or selection are operating at the same time. 

Effective population size 

If the breeding structure does not conform to that specified for the idealized
population, it is possible to evaluate the dispersive process in terms of either the variance
of gene frequencies or the rate of inbreeding. This can be done by the same general
methods and no new principles are involved. We shall therefore give the
conclusions briefly and without detailed explanation. The most convenient way of dealing
with any particular deviation from the idealized breeding structure is to express the
situation in terms of the effective number of breeding individuals, or the effective
population size, $N_e$. This is the number of individuals that would give rise to the
calculated sampling variance, or rate of inbreeding, if they bred in the manner of
the idealized population. Suppose, for example, that the rate of inbreeding, $\Delta F$, had
been calculated for a particular breeding structure from consideration of the
probability of identical homozygotes being produced. In the idealized population, $\Delta F$
is related to the population size N by equation [3.7] as $\Delta F = 1/2N$. The effective
size is related to $\Delta F$ in the same way and would therefore be obtained from the
calculated $\Delta F$ as $N_e = 1/(2\Delta F)$. Thus all the conclusions drawn in the previous
chapter are valid for any breeding structure, and the formulae deduced can be applied,
if the effective number $N_e$ is substituted for the actual number N. When the breeding
structure is known, the effective number can be derived from the actual number,
and the relationships between the two are given below for the most common
departures from the idealized breeding structure. The exact expressions are often
complicated, but in most circumstances a simple approximation can be used with
sufficient accuracy. It is important to note that in these relationships the actual number
N is the number of breeding individuals, and it therefore cannot be obtained from
a census, unless the different age-groups are distinguished. Knowing the effective
population size $N_e$ for any breeding structure, one can then obtain the rate of
inbreeding as

    $\Delta F = \frac{1}{2N_e}$    ... [4.1]

and from $\Delta F$ any of the consequences of inbreeding can be calculated by the
formulae of the previous chapter.


Exclusion of closely related matings 

In bisexual organisms self-fertilization is, of course, impossible. Sib-mating is also 
excluded in man, and is often deliberately avoided in the maintenance of popula¬ 
tions of laboratory and domesticated animals. The exclusion of closely related 
matings, however, does not make a great deal of difference to the rate of inbreeding 
for the following reason. The progeny of a closely related mating have a higher 
coefficient of inbreeding than those of less closely related matings. Their presence 
therefore raises the average coefficient of inbreeding of the population at any time. 
But their higher inbreeding is not permanent: mating at random, they themselves 
are likely to mate with less closely related individuals, and so their higher-than- 
average inbreeding is not passed on to their progeny. Thus the exclusion of closely 
related matings reduces the average coefficient of inbreeding throughout, but it does 
not much affect the rate at which the inbreeding accumulates. The effect of the
exclusions can be quantified approximately in the effective number as follows (Wright,
1969, p. 212).

With self-fertilization excluded,

    $N_e = N + \frac{1}{2}$ (approx.)    ... [4.2a]

and so, by equation [4.1],

    $\Delta F = \frac{1}{2N + 1}$ (approx.)    ... [4.2b]

With sib-mating also excluded,

    $N_e = N + 2$ (approx.)    ... [4.3a]

and

    $\Delta F = \frac{1}{2N + 4}$ (approx.)    ... [4.3b]

The approximations introduce very little error in calculating $\Delta F$ unless N is very
small, as with close inbreeding; but then other methods of deducing $\Delta F$ are required,
as will be explained in the next chapter.
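How little the exclusions change the rate of inbreeding can be seen from a small numerical sketch of equations [4.1] to [4.3]. The function name, its keyword arguments and the population sizes are hypothetical; the approximations are those quoted above and are not reliable for very small N.

```python
def delta_f_with_exclusions(n, exclude_selfing=False, exclude_sibs=False):
    """Approximate rate of inbreeding from [4.1]-[4.3].

    Assumes the approximations N_e = N + 1/2 (selfing excluded) and
    N_e = N + 2 (sib-mating also excluded); poor when N is very small.
    """
    if exclude_sibs:            # excluding sib-mating implies selfing is excluded too
        n_e = n + 2
    elif exclude_selfing:
        n_e = n + 0.5
    else:
        n_e = n                 # idealized population
    return 1.0 / (2 * n_e)


for n in (10, 50, 200):
    print(n,
          round(delta_f_with_exclusions(n), 5),
          round(delta_f_with_exclusions(n, exclude_selfing=True), 5),
          round(delta_f_with_exclusions(n, exclude_sibs=True), 5))
```

The three values in each row differ only slightly, and the difference shrinks further as N grows.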


Different numbers of males and females 

In domestic and laboratory animals the sexes are often unequally represented among 
the breeding individuals, since it is more economical, when possible, to use fewer 
males than females. The two sexes, however, whatever their relative numbers, con¬ 
tribute equally to the genes in the next generation. Therefore the sampling variance 
attributable to the two sexes must be reckoned separately. Since the sampling variance 
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is proportional to the reciprocal of the number, the effective number is twice the 
harmonic mean of the numbers of the two sexes. It is twice the harmonic mean because 
the population size is N = N m + Nf, where N m and Nf are the numbers of males 
and females respectively. The harmonic mean is \![i (l!N m + so 


1 

K 

N e 


1 


+ 


1 


4N rn AN, 
4 NJf f 


(approx.) 


7 


N„ + N, 


(approx.) 


7 


The rate of inbreeding is then 


A F 


&N„ 


+ 


1 

8A^ 


(approx.) 


... [4.4 a] 
. . . [4.4 b] 


. .. [4.5] 


This gives a close enough approximation unless both $N_m$ and $N_f$ are very small, as
with close inbreeding. It should be noted that the rate of inbreeding depends chiefly
on the numbers of the less numerous sex. For example, if a population were maintained
with an indefinitely large number of females but only one male in each generation,
the effective number would be only about 4.
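The dependence on the less numerous sex can be checked numerically. This is a minimal sketch assuming equations [4.4b] and [4.5]; the function names and the herd sizes are hypothetical illustrations.

```python
def ne_unequal_sexes(n_males, n_females):
    """Equation [4.4b]: N_e = 4 N_m N_f / (N_m + N_f), approximately."""
    return 4.0 * n_males * n_females / (n_males + n_females)


def delta_f_unequal_sexes(n_males, n_females):
    """Equation [4.5]: rate of inbreeding, 1/(8 N_m) + 1/(8 N_f)."""
    return 1.0 / (8.0 * n_males) + 1.0 / (8.0 * n_females)


# One male with ever more females: N_e approaches 4 but never exceeds it.
for females in (1, 10, 100, 1_000_000):
    print(females, round(ne_unequal_sexes(1, females), 4))

# A hypothetical herd of 5 sires and 50 dams has 55 breeding animals,
# but an effective number close to 18.
print(round(ne_unequal_sexes(5, 50), 2))
```

Adding females beyond a few times the number of males gains very little, because the term $1/(4N_m)$ dominates the sum in equation [4.4a].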


Unequal numbers in successive generations 

The rate of inbreeding in any one generation is given, as before, by $1/2N$. If the
numbers are not constant from generation to generation, then the mean rate of inbreeding
is the mean value of $1/2N$ in successive generations. The effective number
is the harmonic mean of the numbers in each generation. Over a period of $t$ generations,
therefore,

$\frac{1}{N_e} = \frac{1}{t}\left[\frac{1}{N_1} + \frac{1}{N_2} + \ldots + \frac{1}{N_t}\right]$ (approx.) ... [4.6]


Thus the generations with the smallest numbers have the most effect. The reason 
for this can be seen by consideration of the new and old inbreeding referred to in 
connection with equation [3.6]. An expansion in numbers does not affect the previous 
inbreeding; it merely reduces the amount of new inbreeding. So, in a population 
with fluctuating numbers, the inbreeding proceeds by steps of varying amount, and 
the present size of the population indicates only the present rate of inbreeding. 
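The disproportionate influence of the small generations can be seen in a short calculation. This is a sketch assuming equation [4.6]; the sequence of population sizes is invented for illustration.

```python
def ne_fluctuating(sizes):
    """Equation [4.6]: harmonic mean of the numbers in successive generations."""
    t = len(sizes)
    return t / sum(1.0 / n for n in sizes)


# A hypothetical population that passes through one bottleneck of 10.
sizes = [100, 100, 10, 100]
arithmetic_mean = sum(sizes) / len(sizes)
print(arithmetic_mean)                    # 77.5
print(round(ne_fluctuating(sizes), 2))    # 30.77
```

A single generation at one-tenth of the usual size brings the effective number to well under half of the arithmetic mean.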


Non-random distribution of family size 

This is the most important deviation from the breeding system of the idealized population.
Its consequence is usually to render the effective number less than the actual,
but in special circumstances it makes it greater. Family size means here the number 
of progeny of an individual that become breeding individuals in the next generation. 
It will be remembered that in the idealized population each breeding individual has 
an equal probability of contributing genes, or progeny, to the next generation. The 
contribution of progeny is randomly distributed among the parents, and family sizes 
vary. In real populations the parents seldom have an equal chance of contributing
progeny because they differ in fertility and in the survival of their progeny. This
variation among parents leads to a greater variation of family size, and this has the 
consequence that a greater proportion of the next generation come from a smaller 
number of parents. The effective number is thus reduced. Conversely, the variation 
of family size may, by special breeding methods, be reduced below the random 
amount, with a consequent increase of the effective number. The relation of effective
number to variation of family size is, briefly, as follows.

Attention will be restricted here to populations of constant size and with males 
and females in equal numbers. The mean family size k of all individuals, whether 
male or female, must then be 2 because to replace the population each individual 
must on average have 1 male and 1 female offspring represented among the parents 
of the next generation. Random variation of family size, as in the idealized population,
gives rise to a binomial distribution which, unless $N$ is very small, differs little
from a Poisson distribution. A Poisson distribution has a variance equal to the mean
so the variance of family size when parents have an equal chance of contributing
to the next generation is $V_k = k = 2$. When parents do not have an equal chance
of contributing to the next generation, through differences of fertility or other reasons,
the variance of family size is greater than 2. The way in which the variance of family
size influences the effective number can be deduced by consideration of the probability
of a zygote being an identical homozygote, in a manner similar to that by
which the inbreeding increment was deduced in the last chapter. The effective number
is then obtained from the rate of inbreeding. The relationship to which this leads
is approximately

$N_e = \frac{4N}{V_k + 2}$ (approx.) ... [4.7]

This reduces to $N_e = N$ for the idealized population in which $V_k = 2$. The relationship
in equation [4.7] refers to monogamous mating, when $V_k$ is the same for
both sexes. If males can mate with more than one female, $V_k$ is likely to be different
for males and females. The effective number is then given by

$N_e = \frac{8N}{V_{km} + V_{kf} + 4}$ (approx.) ... [4.8]

where $V_{km}$ and $V_{kf}$ are the variances of family sizes of males and females respectively
(Hill, 1979).
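The behaviour of equations [4.7] and [4.8] at a few values of the variance may make the relationship clearer. This is a sketch only; the function names and the variances used are assumptions chosen for illustration.

```python
def ne_from_family_variance(n, vk):
    """Equation [4.7]: N_e = 4N / (V_k + 2), monogamous mating."""
    return 4.0 * n / (vk + 2.0)


def ne_from_family_variance_by_sex(n, vk_males, vk_females):
    """Equation [4.8]: N_e = 8N / (V_km + V_kf + 4)."""
    return 8.0 * n / (vk_males + vk_females + 4.0)


n = 100
print(ne_from_family_variance(n, 2.0))   # random (Poisson) variation: N_e = N
print(ne_from_family_variance(n, 6.0))   # excess variation: N_e = N / 2
print(ne_from_family_variance(n, 0.0))   # equalized families: N_e = 2N
# Hypothetical case of males varying more than females in family size.
print(ne_from_family_variance_by_sex(n, 10.0, 2.0))
```

With equal variances in the two sexes, equation [4.8] gives the same result as equation [4.7], as it should.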

Variation in family size above the random amount, due to differences of reproductive
success, is the most important cause of $N_e$ being less than $N$, having a much
larger effect than the other departures from the idealized population. There is some
information on how much the effective number is reduced by this cause. The variance
of family size $V_k$ has been estimated in a few species (see Crow and Kimura, 1970),
from which the ratio $N_e/N$ can be calculated, $N$ being the number of breeding individuals,
not the total census number. From the family sizes of women, $N_e/N$
ranged from 0.69 to 0.94 in four sets of data; in the snail Lymnaea it was 0.75; in
Drosophila females it was 0.71 and in males 0.48.

The reduction of $N_e$ from all causes can be estimated from the observed effect
of inbreeding on the variance of gene frequency among lines. The ratio $N_e/N$
has been estimated in this way for Drosophila melanogaster with the sexes equal
in actual numbers. The estimates for the sexes jointly from five experiments ranged
from 0.56 to 0.83 (Kerr and Wright, 1954a, b; Wright and Kerr, 1954; Buri, 1956).

Minimal inbreeding 

It is often desirable to keep stocks of laboratory animals with the least possible inbreeding.
Increasing the number of breeding individuals $N$ as much as possible is
not the only thing that can be done. By choice of the individuals to be used as parents,
the variance of family size, $V_k$, can be reduced below its random amount, and the
effective number consequently increased. If the individuals are chosen equally from
all families, then there is no variation in family size, and $V_k = 0$. Substitution into
equation [4.7] shows that the effective number then becomes $N_e = 2N$, approximately.
The exact relationship is

$N_e = 2N - 1$ ... [4.9]

which is very nearly twice what it would be in an idealized population of the same 
size. Equation [4.9] refers to a population bred from equal numbers of males and 
females. Equalization of family size then means choosing two individuals from the 
progeny of each pair of parents. 

If the sexes are unequal in numbers, the variance of family size can be made zero 
by choosing as parents one male from each sire’s progeny and one female from each 
dam’s progeny. The rate of inbreeding is then given by the following formula (Gowe, 
Robertson, and Latter, 1959): 


$\Delta F = \frac{3}{32N_m} + \frac{1}{32N_f}$ ... [4.10]


where N m and Nf are the actual numbers of male and female parents respectively, 
and females are more numerous than males. 

The avoidance of matings between close relatives, such as sibs or cousins, seems 
at first sight to be an easy way of reducing the rate of inbreeding. This delays the
first increment of inbreeding, but very little reduction of the subsequent rate of inbreeding
is achieved. The reasons for this were explained earlier and equation [4.3a]
gives the effective population size with self-fertilization and sib-mating excluded.
If family size is deliberately equalized then the avoidance of closely related matings 
achieves no further reduction in the rate of inbreeding (Robinson and Bray, 1965). 
The chief advantages of avoiding matings between close relatives are to make the 
rate of inbreeding more constant from generation to generation, and to make the 
inbreeding coefficients of individuals more uniform within generations. 
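The gain from equalizing family size can be put in numbers. This is a minimal sketch assuming equations [4.9], [4.10] and [4.4b]; the function names and the example of 5 sires and 50 dams are hypothetical.

```python
def ne_equalized_pairs(n):
    """Equation [4.9]: N_e = 2N - 1, equal numbers of males and females."""
    return 2 * n - 1


def delta_f_equalized_unequal_sexes(n_males, n_females):
    """Equation [4.10]: 3/(32 N_m) + 1/(32 N_f), females the more numerous."""
    return 3.0 / (32.0 * n_males) + 1.0 / (32.0 * n_females)


def ne_random_unequal_sexes(n_males, n_females):
    """Equation [4.4b], for comparison with random family sizes."""
    return 4.0 * n_males * n_females / (n_males + n_females)


# 8 pairs of parents with equalized families, as in Example 4.1.
print(ne_equalized_pairs(16))   # 31

# Hypothetical stock of 5 sires and 50 dams per generation.
ne_equalized = 1.0 / (2.0 * delta_f_equalized_unequal_sexes(5, 50))
print(round(ne_equalized, 1), round(ne_random_unequal_sexes(5, 50), 1))
```

In the second case the equalized scheme gives an effective number of about 26 against about 18 with random family sizes, so the gain is worth having but is less than the doubling obtained with equal numbers of the two sexes.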


Overlapping generations 

In most natural populations, and in domesticated animals, the generations are not
discrete but are overlapping. This means that the individuals present at any time
are of different ages and at different stages of their life-cycles. Furthermore, individuals
differ in length of life and consequently in their opportunities for reproduction.
Differences of lifetime therefore add to differences of fertility in increasing
the variance of family size, the longer-lived individuals having a greater chance of
contributing offspring to the next generation than the shorter-lived. The effect on
$N_e$ is dealt with by equations [4.7] or [4.8]. There is, however, a problem in finding
what is the total number per generation, i.e., $N$ in equation [4.7]. Provided the
population has a stable age-structure the total number per generation can be found
as follows. We need to know the number of individuals born within a specified time-interval,
which might be one year or any convenient period. This number is the
size of the cohort defined by the time-interval. The cohort size $N_c$ is related to the
total number alive at any time, $N_T$, i.e., the census count, by $N_c = N_T/E$, where $E$
is the expectation of life, or the mean age at death, expressed in units of the specified
time-interval that defines the cohort (see Emigh and Pollak, 1979). We need to know
also the generation length $L$ in units of the specified time-interval, the generation
length being the average age of parents at the birth of their offspring. Then the total
number per generation is $N = N_c L$, and the effective number per generation is
given approximately by

$N_e = \frac{4N_c L}{V_k + 2}$ (approx.) ... [4.11]

(Hill, 1979), where V k is the variance of family size from all causes. If males and 
females differ in numbers or in generation length, as is often the case with farm 
animals, equation [4.11] has to be modified in a manner explained by Hill (1979). 
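The steps leading to equation [4.11] can be followed in a short calculation. This is a sketch under the stated assumption of a stable age-structure; the census count, expectation of life, generation length and variance used below are invented for illustration and do not come from any real population.

```python
def ne_overlapping(census_count, expectation_of_life, generation_length, vk):
    """Equation [4.11]: N_e = 4 N_c L / (V_k + 2), with N_c = N_T / E."""
    cohort_size = census_count / expectation_of_life      # N_c = N_T / E
    number_per_generation = cohort_size * generation_length  # N = N_c L
    return 4.0 * number_per_generation / (vk + 2.0)


# Hypothetical population: 1,000 alive at any time, expectation of life
# 5 years, generation length 3 years, variance of family size 4.
print(ne_overlapping(1000, 5.0, 3.0, 4.0))   # 400.0
```

Here the cohort size is 200 per year and the number per generation is 600, so the effective number of 400 is well below the census count of 1,000.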

The effective number in the human population of the USA has been estimated
as $N_e = 0.41N_T$, but the ratio is probably somewhat lower than this because the
estimate did not take account of all the possible sources of variation of fertility (Emigh
and Pollak, 1979).

Example 4.1 Data from a mouse experiment (Garnett and Falconer, 1975, and unpublished)
will serve to illustrate the use of several of the formulae deduced in this and the
previous chapter. Furthermore, by calculating the effective population size independently
from the variance and the inbreeding approaches, we can check on the validity of the
theory. The population consisted of 18 lines, all originating from the same random-bred
base and all maintained by minimal inbreeding with 8 pairs of parents mated in every
generation (Falconer, 1973). The data consisted of gene and genotype frequencies at
5 polymorphic enzyme loci in each of the lines. The enzyme loci are listed in the table.
There were two alleles present at all the loci; all the heterozygotes were distinguishable
and the gene frequencies were obtained by counting (equation [1.1]). At generation 27
all the parents were typed, so the gene frequencies at that time were determined without
error. For each locus, the variance of gene frequency among the 18 lines was calculated.
The table gives, for each locus, the mean gene frequency, $\bar{q}$, the variance of gene
frequency, $\sigma^2_q$, and the overall frequency of heterozygotes in the population as a whole,
$H$. There is no reason to think that the gene frequencies had changed from their values
in the base population, so for calculations it is assumed that $q_0 = \bar{q}$.

The calculations to be made are: (1) the effective population size $N_e$, expected from
the number of parents $N$, and the breeding structure; (2) the inbreeding coefficient $F$,
from the variance of gene frequencies $\sigma^2_q$, and then $N_e$ from $F$ at generation $t = 27$;
(3) $F$ at $t = 27$ from the frequency of heterozygotes $H$, and then $N_e$ from $F$ again.

1. With 8 pairs of parents, $N = 16$. With minimal inbreeding ($V_k = 0$), equation [4.9]
gives $N_e = 2N - 1 = 31$. Equation [4.1] gives $\Delta F = 1/(2N_e) = 0.0161$, and equation
[3.12] gives $F = 1 - (1 - 0.0161)^{27} = 0.355$. These expected values will be realized
only if $V_k = 0$ is achieved. In practice some pairs will inevitably be sterile, so $V_k$
will not be zero and $N_e$ will be less than 31.

2. $F$ is related to $\sigma^2_q$ by equation [3.14]. Taking Dip-1 as an example, $F = \sigma^2_q/\bar{p}\bar{q}$ =
0.077/(0.236 × 0.764) = 0.427. Each locus gives an independent estimate of $F$. They
are given in the table in the column headed F(2). The mean is $F = 0.378$. From this
mean estimate of $F$, we get the rate of inbreeding $\Delta F$ from equation [3.12]. By
rearrangement, $(1 - \Delta F)^t = 1 - F$, which with $t = 27$ yields $\Delta F = 0.0174$. The
effective population size $N_e$ is found from $\Delta F$ by equation [4.1]. This gives $N_e = 1/(2\Delta F)$
= 28.7.

3. Equal numbers of individuals were classified in all lines, so $F$ can be estimated
from the overall frequency of heterozygotes, $H$. This could be done by equation [3.15],
but the number of parents per line is small enough to make the expected chance differences
of gene frequencies in males and females not negligible. Allowance for this
is made in equation [3.17], which gives $(1 - F) = H/[2\bar{p}\bar{q}(1 + 1/2N)]$. Taking Dip-1
again as an example, $(1 - F)$ = 0.240/(2 × 0.236 × 0.764 × 1.03125) = 0.646,
and $F = 0.354$. Again, each locus gives an independent estimate of $F$, as given in the
column headed F(3). $\Delta F$ and $N_e$ are calculated from $F$ in the same way as under calculation
2 above, and the values are entered at the foot of the table.


Locus     $\bar{q}$   $\sigma^2_q$   $H$     F(2)    F(3)
Dip-1     0.764       0.077          0.240   0.427   0.355
Id-1      0.370       0.102          0.301   0.438   0.374
Gpi-1     0.297       0.072          0.283   0.345   0.343
Gpd-1     0.215       0.042          0.253   0.249   0.273
Got-2     0.141       0.052          0.134   0.429   0.464
Mean                                         0.378   0.362

Method                                          $F$     $\Delta F$   $N_e$
1. Expected from breeding structure             0.355   0.0161       31
2. Variance of gene frequency, $\sigma^2_q$     0.378   0.0174       28.7
3. Frequency of heterozygotes, $H$              0.362   0.0165       30.3
4. Pedigrees                                    0.379   0.0175       28.6


The inbreeding coefficient can be calculated in yet another way, from the pedigree
records, in a manner to be explained in the next chapter. This is an exact determination
because it is based on the probabilities of identical homozygotes arising from the matings
actually made. The calculation was made for each line, and the mean value was $F$ =
0.379. This gives $\Delta F = 0.0175$ and $N_e = 28.6$. The ratio of $N_e$ (achieved)/$N_e$ (expected)
is 28.6/31 = 0.92, and the ratio of $N_e$ (achieved)/$N$ is 28.6/16 = 1.79. Comparing the
three estimates of $N_e$ shown at the foot of the table, we see that the estimates from the
variance of gene frequencies and from the frequency of heterozygotes agree very well
with the pedigrees. It may be noted that if the correction for unequal gene frequencies
in male and female parents is not made, the estimate of $N_e$ from the frequency of
heterozygotes is 32.7 instead of 30.3.
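The arithmetic of calculations 2 and 3 can be repeated for all five loci. The following sketch assumes the rearranged forms of equations [3.12], [3.14], [3.17] and [4.1] used in the example, with the data copied from the table; the function and variable names are illustrative. Because the tabulated inputs are rounded, the results should be expected to agree with the table only to about the third decimal place.

```python
# Data from the table of Example 4.1: locus -> (mean q, variance of q, H).
LOCI = {
    "Dip-1": (0.764, 0.077, 0.240),
    "Id-1": (0.370, 0.102, 0.301),
    "Gpi-1": (0.297, 0.072, 0.283),
    "Gpd-1": (0.215, 0.042, 0.253),
    "Got-2": (0.141, 0.052, 0.134),
}
N_PARENTS = 16   # 8 pairs of parents per line
T = 27           # generation at which the parents were typed


def f_from_variance(q, var_q):
    """Equation [3.14] rearranged: F = variance / (p q)."""
    return var_q / ((1.0 - q) * q)


def f_from_heterozygotes(q, h, n):
    """Equation [3.17] rearranged: 1 - F = H / [2 p q (1 + 1/(2N))]."""
    return 1.0 - h / (2.0 * (1.0 - q) * q * (1.0 + 1.0 / (2.0 * n)))


def delta_f_from_f(f, t):
    """Equation [3.12] rearranged: (1 - delta F)^t = 1 - F."""
    return 1.0 - (1.0 - f) ** (1.0 / t)


def ne_from_f(f, t):
    """Equation [4.1]: N_e = 1 / (2 delta F)."""
    return 1.0 / (2.0 * delta_f_from_f(f, t))


f2 = [f_from_variance(q, v) for q, v, _ in LOCI.values()]
f3 = [f_from_heterozygotes(q, h, N_PARENTS) for q, _, h in LOCI.values()]
mean_f2 = sum(f2) / len(f2)
mean_f3 = sum(f3) / len(f3)
print(f"method 2: F = {mean_f2:.3f}, N_e = {ne_from_f(mean_f2, T):.1f}")
print(f"method 3: F = {mean_f3:.3f}, N_e = {ne_from_f(mean_f3, T):.1f}")
```

Calling `ne_from_f(0.379, 27)` reproduces the pedigree estimate in the same way.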


Mutation, migration, and selection 

The description of the dispersive process given so far in this chapter and the previous 
one is conditional on the systematic processes of mutation, migration, and selection
being absent, and its relevance to real populations is therefore limited. So let us
now consider the effects of the dispersive and systematic processes when acting 
jointly. The systematic processes, as we have seen in Chapter 2, tend to bring the 
gene frequencies to stable equilibria at particular values which would be the same 
for all populations under the same conditions. The dispersive process, in contrast, 
tends to scatter the gene frequencies away from these equilibrium values, and if not 
held in check by the systematic processes it would in the end lead to all genes being 
either fixed or lost in all populations not infinite in size. The tendency of the systematic 
processes to change the gene frequency toward its equilibrium value becomes stronger 
as the frequency deviates further from this value. For this reason there is a point 
of balance at which the dispersion of gene frequencies is held in check by the 
systematic processes. There is then a certain degree of differentiation between sub-populations
which remains constant so long as the conditions remain unchanged.
The problem is therefore to find the distribution of gene frequencies among the lines 
of a subdivided population when this steady state has been reached. It should be 
noted, however, that it may take a very long time after a change of conditions for 
a population to attain the steady state, so the distributions deduced are only approximately
applicable to real populations. The solution is complicated mathematically
and only the main conclusions will be given. For a more detailed account, see Crow 
(1986). 

Non-recurrent neutral mutation 

We shall first consider briefly the fate of unique mutations that are effectively neutral 
with respect to fitness. The fate of such mutants forms the basis of the neutral theory 
of molecular evolution, of which a full account is given by Kimura (1983). What 
is the chance that an allelic substitution will occur at any particular locus by the 
process of mutation and random drift? An ‘allelic substitution’ means that the allele 
or alleles present now are all replaced by a new one at some time in the future. 
The number of representatives of each autosomal locus present at any time is 2 N, 
where N is the actual population size, and one of these will, in the absence of selection,
eventually become fixed. Therefore the chance that any particular one becomes
fixed is 1/2 N. Let u be the neutral mutation rate at the locus in question; i.e., the 
probability that a new neutral allele appears by mutation in any one generation. The 
total number of new mutants at the locus is then 2 Nu, assuming that each new 
mutant is initially present in only one copy. For each mutant separately, the chance 
of fixation is 1/2 N. Therefore the probability that one or another of the new mutants 
becomes fixed is 2 Nu X 1/2 N = u. This is the probability of an allelic substitution 
at a locus occurring in any particular generation, and it is simply equal to the mutation
rate per generation at the locus in question. To get the rate of substitution at
all loci together, we put u equal to the neutral mutation rate per gamete, i.e., the 
frequency of gametes carrying a new neutral mutant at any locus. The reason why 
the rate of allelic substitution is independent of the population size is that in a larger 
population the larger number of new mutants is balanced by the smaller individual 
chance of survival. 
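The cancellation of population size can be verified with exact fractions. This is a sketch of the argument just given and nothing more; the mutation rate of one in a million is an arbitrary value assumed for illustration.

```python
from fractions import Fraction


def neutral_substitution_rate(n, u):
    """Expected new mutants per generation times the chance each is fixed."""
    new_mutants = 2 * n * u               # 2N gene copies, each mutating at rate u
    chance_of_fixation = Fraction(1, 2 * n)   # one copy among 2N, no selection
    return new_mutants * chance_of_fixation


u = Fraction(1, 1_000_000)   # hypothetical neutral mutation rate
for n in (10, 1_000, 1_000_000):
    print(n, neutral_substitution_rate(n, u))
```

Every line prints the same rate, equal to $u$, whatever the value of $N$.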

Selection, of course, increases or decreases the chance of fixation, according to 
whether the new mutant is favourable or unfavourable. The great majority of mutants 
are expected to be deleterious rather than beneficial. What is the chance that a
deleterious mutant gives rise to an allelic substitution? By the same reasoning, this
is equal to the mutation rate to ‘effectively neutral’ alleles; and, according to Kimura
(1983), an ‘effectively neutral’ allele is one with a coefficient of selection $s$ against
it in the range from $s = 0$ (i.e. strictly neutral) up to $s = 1/2N_e$. Thus effective
neutrality depends on the effective population size, and an effectively neutral allele
is one for which the product $N_e s$ is less than 1.

The expectation that some deleterious mutants will become fixed by random drift 
means that all populations must tend to decline in natural fitness over a long period 
of time, unless the conditions that determine fitness change or the decline is counterbalanced
by the occurrence and fixation of favourable mutants (see Kimura, 1983,
p. 248). There are important practical consequences of this expectation. Inbred lines
of laboratory animals are maintained by brother-sister matings for which the effective
population size is about 2.5 (equation [4.2a]). Deleterious mutants with a coefficient
of selection of up to about 20 per cent could therefore become fixed and the
consequent loss of reproductive fitness could be serious, unless counteracted by artificially
intensified selection; favourable mutants are far too rare to counterbalance the
loss of fitness in small populations. The fixation of a deleterious gene in a small 
population is illustrated later, in Example 4.2. 
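The figure of 20 per cent follows directly from the upper limit of effective neutrality. This is a small sketch assuming the limit $s = 1/2N_e$ quoted from Kimura (1983) and equation [4.2a]; the function name and the larger population sizes are illustrative.

```python
def largest_effectively_neutral_s(ne):
    """Upper limit of effective neutrality, s = 1 / (2 N_e)."""
    return 1.0 / (2.0 * ne)


# Brother-sister mating: N = 2, so N_e = N + 1/2 = 2.5 by equation [4.2a].
print(largest_effectively_neutral_s(2 + 0.5))   # 0.2

# Hypothetical larger populations for comparison.
for ne in (25, 250, 2500):
    print(ne, largest_effectively_neutral_s(ne))
```

The limit falls in inverse proportion to the effective number, so only very mildly deleterious mutants escape selection in populations of ordinary size.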


Recurrent mutation and migration 

Recurrent mutation and migration can be dealt with together because they change 
the gene frequency in the same manner. Consider again a population subdivided 
into many lines, all with an effective size N e ; and let a proportion m of the breeding 
individuals of every generation in each line be immigrants coming at random from 
all other lines. Let u and v be the mutation rates in the two directions between two 
alleles at a locus. The state of dispersion between the sub-populations, when the 
balance between dispersion on the one hand and migration and mutation on the other 
is reached, can be expressed as the inbreeding coefficient as follows: 


$F = \frac{1}{4N_e(u + v + m) + 1}$ (approx.) ... [4.12]


If the mean gene frequencies were known, the state of dispersion could be expressed
as the variance of gene frequency by putting $\sigma^2_q = F\bar{p}\bar{q}$, from equation [3.14].

The theoretical distributions of gene frequencies corresponding to four equilibrium 
values of F are shown in Fig. 4.1. These distributions are similar in general form 
to the distributions that a population goes through during the process of inbreeding 
without mutation or migration, shown in Fig. 3.5. The effect of mutation or migration
can be thought of as arresting the process at a point corresponding to some value
of $F$ or of $\sigma^2_q$, the variance of gene frequency among sub-populations. The chief
difference here is that if $F$ goes beyond 0.33, when all gene frequencies, including
fixation, are equally probable, the distribution becomes U-shaped, with more sub-populations
being at the extremes and fewer at intermediate gene frequencies. The
reason for this is that, with mutation or migration, fixation in any one line is not 
permanent. 

Fig. 4.1. Theoretical distributions of gene frequency among sub-populations, when dispersion is
balanced against mutation or migration, and the mean gene frequency is 0.5. The vertical axis is
the probability, as in Fig. 3.5. The states of dispersion to which the curves refer are indicated by
the values of $F$ in the figure. The values of $N_e m$ are the numbers of immigrants per generation, as
explained in the text. (Based on Wright, 1951.)

In order to see what equation [4.12] and the distributions in Fig. 4.1 mean, let
us consider these questions: at what value of $F$ will the population stabilize if there
is mutation at the known rates, but no migration? and: how much migration, with
no mutation, would be needed to produce the distributions shown? For this purpose
it will be accurate enough to take equation [4.12] as $N_e = 1/[4F(u + v + m)]$.
Substitute in this the usual mutation rate of $u + v = 10^{-5}$, with $m = 0$, and $F = 0.005$
corresponding to the least dispersed distribution in Fig. 4.1. This gives $N_e
= 5 \times 10^6$, which means that sub-populations of size 5 million would differentiate
as far as $F = 0.005$ before being stabilized by mutation. Smaller sub-populations
would differentiate further. For example, the uniform distribution corresponding
to $F = 0.333$ would be reached by sub-populations of size 75,000. For many species,
perhaps most, even this is an unrealistically large size for sub-populations mating
at random within themselves. The conclusion, therefore, is that recurrent mutation
is negligible as a factor slowing down or arresting the differentiation of sub-populations
by random drift. Migration, however, is quite a different matter. Rearrangement
of equation [4.12], with $u + v = 0$, gives $F = 1/(4N_e m + 1)$. Thus
the state of dispersion depends on the number of immigrants per generation, which
is $N_e m$, irrespective of the population size. This conclusion, which may at first seem
paradoxical, can be understood by noting that a smaller population needs a higher
rate of immigration than a larger one to be held at the same state of dispersion.
Substitution of the values of $F$ corresponding to the distributions in Fig. 4.1 gives
the values of $N_e m$ entered in the figure. For example, one immigrant every alternate
generation ($N_e m = \frac{1}{2}$) is sufficient to maintain the flat distribution of gene
frequencies corresponding to $F = 0.333$. The conclusion is that quite small numbers
of immigrants will prevent much differentiation by random drift. The reason why
mutation and migration are so different in their effects is the same as was pointed
out in Chapter 2: realistic mutation rates are very much smaller than realistic migration
rates.
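The two questions can be worked through numerically. This is a minimal sketch assuming equation [4.12] and the simplified form of it used in the text; the function names and the sub-population of 100 are illustrative assumptions.

```python
def equilibrium_f(ne, migration=0.0, mutation=0.0):
    """Equation [4.12]: F = 1 / (4 N_e (u + v + m) + 1); mutation is u + v."""
    return 1.0 / (4.0 * ne * (mutation + migration) + 1.0)


def ne_simplified(f, rate):
    """Simplified form used in the text: N_e = 1 / (4 F (u + v + m))."""
    return 1.0 / (4.0 * f * rate)


def immigrants_per_generation(f):
    """N_e m from F = 1 / (4 N_e m + 1), with no mutation."""
    return (1.0 / f - 1.0) / 4.0


# Mutation alone at u + v = 1e-5: the sizes needed are very large.
print(round(ne_simplified(0.005, 1e-5)))   # 5000000
print(round(ne_simplified(0.333, 1e-5)))   # about 75,000

# Migration alone: half an immigrant per generation holds F at 1/3.
print(immigrants_per_generation(1.0 / 3.0))
print(equilibrium_f(100, migration=0.005))   # N_e m = 0.5 in a line of 100
```

The last line shows that a sub-population of 100 receiving immigrants at a rate of $m = 0.005$ is held at $F = 1/3$, the same state of dispersion as any other sub-population with $N_e m = \frac{1}{2}$.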

The situation to which the foregoing consideration of migration refers is known 
as the ‘island model’. It pictures a discontinuous population with immigrants to any 
sub-population coming from any other sub-population with equal probability. A more 
realistic model is the ‘neighbourhood model’ or ‘isolation by distance’. The population
is pictured as being continuously distributed over the area inhabited, but subdivided
into ‘neighbourhoods’ by the limited distance that individuals travel between
birth and reproduction. A neighbourhood is the area within which mating is effectively
random, and corresponds to a sub-population. Gene frequencies, however,
vary continuously from neighbourhood to neighbourhood across the area. Since 
immigrants to a neighbourhood come from close by more often than from further 
away, they differ in gene frequency less than immigrants in the island model do. 
Therefore migration is less effective in counteracting random drift. The conclusion 
to which the neighbourhood model leads is that a large amount of local differentiation
will take place if the effective number in the neighbourhoods is of the order
of 20, a moderate amount if it is of the order of 200, but a negligible amount if 
it is larger than about 1,000. 

Selection 

Selection operating on a locus in a large population brings the gene frequency to 
an equilibrium, at an intermediate value when selection favours heterozygotes and 
at a low value when selection is balanced against mutation. The dispersive process 
tends to shift the gene frequency away from its equilibrium value. This reduces the 
average fitness of the population, because the load is minimal at the equilibrium, 
and some sub-populations may even become fixed for the deleterious allele. The 
effect of selection is stronger the further the gene frequency is away from the 
equilibrium value. So the opposing forces of selection and random drift reach a 
balance at which there is a stable distribution of gene frequencies among sub-populations.
The question then is: how small must the sub-populations be to cause
appreciable differentiation with its consequent deviations from the optimal gene frequency?
The following illustrative cases will have to suffice for an answer, and for
an understanding of the joint effects of mutation, selection, and dispersion the reader 
must consult other sources. 

Consider first selection favouring heterozygotes. The effect depends on the
equilibrium gene frequency. When the two homozygotes are at an equal disadvantage
and the equilibrium gene frequency is consequently 0.5, the distributions of
gene frequencies look roughly like those in Fig. 4.1. The least dispersed one, corresponding
to $F = 0.005$, would be attained by a selection coefficient of $s = 0.1$
against both homozygotes in sub-populations of size $N_e = 1,000$. More dispersion
would need less selection or smaller populations. The most dispersed distribution,
with a substantial amount of fixation, would need very roughly $s = 0.01$ with $N_e
= 100$. Thus selection for heterozygotes does not allow much random drift unless
the selection is very weak (around 1 per cent) or the population size very small (around
100). If, however, the equilibrium gene frequency is not 0.5, the selection is less
effective in preventing the random drift. When the equilibrium gene frequency is
above roughly 0.8 or below 0.2 the selection actually accelerates the random drift
(Robertson, 1962). The reason for this is that one homozygote is then much fitter
than the other and the selection increases the probability of fixation of the more fit
homozygote.

Fig. 4.2. Theoretical distributions of gene frequency among sub-populations when the
dispersion is balanced by mutation and selection. The graphs refer to a recessive gene with
$u = v = \frac{1}{20}s$, in populations of size: (a) $N_e = 50/s$, (b) $N_e = 5/s$, and (c) $N_e = 0.5/s$. (Based on
Wright, 1942.)

Next consider selection against a recessive allele balanced by recurrent mutation.
This is difficult to illustrate because with realistic values of the selection coefficient,
the equilibrium gene frequency will be very low, and the distributions are
squeezed up against the limit near $q = 0$. Figure 4.2 shows three stable distributions
for very weak selection with an equilibrium gene frequency of about $q = 0.2$.
Mutation rate is taken to be the same in both directions, and the coefficient of selection
$s$ is 20 times the mutation rate. If we assume a mutation rate of $10^{-5}$ then $s
= 20 \times 10^{-5}$ and the population sizes to which the distributions refer are (a)
250,000, (b) 25,000, and (c) 2,500. The conclusion is, again, that selection does
not allow much random drift unless the selection is weak or the population size very
small. The amount of random drift depends approximately on the product of the
population size and the selection coefficient $N_e s$, and the three distributions in Fig.
4.2 correspond to values of $N_e s$ equal to 50, 5, and 0.5 respectively.
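The population sizes quoted for Fig. 4.2 follow from the three values of $N_e s$. This sketch assumes the mutation rate of $10^{-5}$ used in the text; the check on the equilibrium gene frequency uses the approximation $q = \sqrt{u/s}$ for a recessive gene, which is brought in here as an assumption and which ignores reverse mutation.

```python
import math


def population_size(ne_s, s):
    """Population size corresponding to a given product N_e s."""
    return ne_s / s


u = 1e-5        # mutation rate assumed in the text
s = 20 * u      # coefficient of selection, 20 times the mutation rate

for label, ne_s in (("a", 50), ("b", 5), ("c", 0.5)):
    print(label, round(population_size(ne_s, s)))

# Approximate equilibrium frequency of the recessive gene (assumed formula).
print(round(math.sqrt(u / s), 3))
```

The approximate equilibrium frequency comes out a little above 0.2, in agreement with the value of about $q = 0.2$ given for the figure.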

Example 4.2 The opposing forces of dispersion and selection are illustrated in Fig. 4.3,
from an experiment with Drosophila melanogaster (Wright and Kerr, 1954). The frequency
of the sex-linked gene ‘Bar’ was followed for 10 generations in 108 lines each
maintained by 4 pairs of parents. (On account of the complication of sex-linkage, which
increases the rate of dispersion, the theoretical effective number was 6.765; the effective
number as judged from the actual rate of dispersion was $N_e = 4.87$.) The initial
gene frequency was 0.5. The circles in the figure show the distribution of the gene frequency
among the lines in the fourth to tenth generations, when the distribution had reached
its steady form. The smooth curve shows the theoretical distribution based on $N_e = 5$
and a coefficient of selection against Bar of $s = 0.17$. Previously fixed lines are not
included in the distributions. Altogether, at the tenth generation, 95 of the 108 lines had
become fixed for the wild-type allele and 3 for Bar while 10 remained unfixed. Thus,
despite a 17 per cent selective disadvantage, the deleterious allele was fixed in about
3 per cent of the lines.

Fig. 4.3. Distribution of gene frequencies under inbreeding and selection, as explained in
Example 4.2. Axis label: number of Bar genes. (Data from Wright and Kerr, 1954.)


Random drift in natural populations 

Having described the dispersive process and its theoretical consequences, we may 
now turn to the more practical question of how far these consequences are actually 
seen in natural populations. The answering of this question is beset with difficulties, 
and the following comments are intended more to indicate the nature of these difficulties 
than to answer the question. 

The theory of small populations, outlined in this and the preceding chapter, is 
essentially mathematical in nature and is unquestionably valid: given only the 
Mendelian mechanism of inheritance, the conclusions arrived at are a necessary consequence 
under the conditions specified. The question at issue, then, is whether the 
conditions in natural populations are often such as would allow the dispersion of 
gene frequencies to become detectable. The phenomena which would be expected 
to result from the dispersive process, if the conditions were appropriate, are differentiation 
between the inhabitants of different localities, and differences between 
successive generations. Both these phenomena are well known in subdivided or small 
isolated populations, and it is tempting to conclude that because they are the expected 
consequences of random drift, random drift must be their cause. But there are other 
possible causes: the environmental conditions probably differ from one locality to 
another and from one season to another; so the intensity, or even the direction, of 
selection may well vary from place to place and from year to year, and the differences 
observed could equally well be attributed to variation of the selection pressure. Before 
we can justifiably attribute these phenomena to random drift, therefore, we have 
to know: (1) that the effective population size is small enough; (2) that the sub-populations 
are well enough isolated (or the size of the ‘neighbourhoods’ sufficiently 
small); and (3) that the genes concerned are subject to very little selection. 

The estimation of the present size of a population, though not technically easy, 
presents no difficulties of principle. But the present state of differentiation depends 
on the population size in the past, and this can generally only be guessed at. It is 
difficult to know how often the population may have been drastically reduced in 
size in unfavourable seasons, and the dispersion taking place in these generations 
of lowest numbers is permanent and cumulative. If a species colonizes a new territory, 
the founding members of the new sub-population may be very few in numbers, 
causing a substantial amount of random drift in the first generation. This is called 
the founder effect. If the sub-population then expands, its difference from the main 
population may seem much too great to be consistent with its present numbers. To 
attribute the difference to a founder effect may often be plausible but, in the absence 
of pedigree records, can seldom be other than a guess. One of the clearest and most 
interesting examples of isolated populations being differentiated as a result of founder 
effects is seen in the Amish communities in the USA, studied by McKusick (1978), 
the founder effects being established by genealogical records. 

There is less difficulty in deciding whether the sub-populations are sufficiently 
well isolated. With a discontinuous population it is often possible to be reasonably 
sure that there is not too much immigration; and with a continuous population the 
size of the ‘neighbourhoods’ is, at least in principle, measurable. The greatest difficulty 
lies in estimating the intensity of natural selection acting on the genes concerned. 
Selection of an intensity far lower than could be detected experimentally 
is sufficient to check dispersion in all but the smallest populations. It seems rather 
unlikely — though this is no more than an opinion — that any gene that modifies 
the phenotype enough to be recognized visually would have so little effect on fitness. 
Many of the genes concerned with enzyme polymorphisms, however, may have selection 
coefficients low enough to allow populations to become differentiated. The genes 
concerned with quantitative differences may also be nearly enough neutral for random 
drift to take place. There is no doubt at all that genes of this sort do show random 
drift in laboratory populations, as will be shown in later chapters. 


Problems 

4.1 Suppose that four Drosophila stocks are maintained by putting a fixed number 
of unmated adults in a bottle and allowing them to mate at random. All stocks have 
10 female parents but different numbers of male parents, the numbers of males being 
10, 5, 2 and 1 respectively. Calculate the effective population size of each stock 
and the inbreeding coefficient after 10 generations. Assume that there are no 
differences of fertility among females or among males. 

[Solution 83] 


4.2 The sex ratio among breeding individuals can be expressed as the number of 
females per male. Modify equation [4.4*] so as to express $N_e$ in terms of the 
number of females, $N_f$, and the number of females per male, $d$. 

[Solution 93] 

4.3 Suppose that an isolated natural population goes through a regular 5-year 
cycle of numbers, with the numbers of breeding pairs in successive generations being 
500, 50, 100, 200, 400. What is the effective population size and the rate of 
inbreeding? 

[Solution 103] 

4.4 Compare the (approximate) rates of inbreeding in two varieties of a plant, one 
of which is self-fertile and the other self-sterile, when both are propagated by random 
pollination among 20 individual plants. 

[Solution 113] 

4.5 It is planned to keep a mouse stock with 8 pair-matings per generation and 
minimal inbreeding. The plan, however, cannot be strictly adhered to because some 
pairs fail to provide the two offspring required. In one particular generation the 8 
matings provided the following numbers of offspring that were used as parents: 0, 
1, 1, 2, 2, 3, 3, 4. What was the effective population size in this generation? 

[Solution 123] 

4.6 The breeding plan for each of the lines of the mouse stock described in Problem 
3.6 was to mate 8 pairs and to use 2 offspring from each pair as parents of 
the next generation. If this plan had been strictly adhered to, what would have been 
the effective population size of the lines? What was actually the effective population 
size indicated by the data in Problem 3.6? 


[Solution 133] 



SMALL POPULATIONS: 

III. Pedigreed Populations and Close Inbreeding 


In the two preceding chapters the genetic properties of small populations were described 
by reference to the effective number of breeding individuals; and expressions 
were derived, in terms of the effective number, by means of which the state 
of dispersion of the gene frequencies could be expressed as the coefficient of inbreeding. 
The coefficient of inbreeding, which is the probability of any individual 
being an identical homozygote, was deduced from the population size and the specified 
breeding structure. It expressed, therefore, the average inbreeding coefficient of all 
individuals of a generation. When pedigrees of the individuals are known, however, 
the coefficient of inbreeding can be more conveniently deduced directly from the 
pedigrees, instead of indirectly from the population size. This method has several 
advantages in practice. Knowledge is often required of the inbreeding coefficient 
of individuals, rather than of the generation as a whole, and this is what the calculation 
from pedigrees yields. In domestic animals, some individuals often appear as 
parents in two or more generations, and this overlapping of generations causes no 
trouble when the pedigrees are known. The first topic for consideration in this chapter 
is therefore the computation of inbreeding coefficients from pedigrees. The second 
topic concerns regular systems of close inbreeding. When self-fertilization is 
excluded, the rate of inbreeding expressed in terms of the population size is only 
an approximation, and the approximation is not close enough if the population size 
is very small. Under systems of close inbreeding, therefore, the rate of inbreeding 
must be deduced differently, and this is best done also by consideration of the 
pedigrees. 

When the coefficient of inbreeding is deduced from the pedigrees of real populations, 
it does not necessarily describe the state of dispersion of the gene frequencies. 
It is essentially a statement about the pedigree relationships, and its 
correspondence with the state of dispersion is dependent on the absence of the 
processes that counteract dispersion, in particular on selection being negligible. We 
were able to use the coefficient of inbreeding as a measure of dispersion in the 
preceding chapters because the necessary conditions for its relationship with the 
variance of gene frequencies were specified. 

Pedigreed populations 

The inbreeding coefficient of an individual 

This coefficient is the probability that the pair of alleles carried by the gametes that 
produced it were identical by descent. Computation of the inbreeding coefficient 
therefore requires no more than the tracing of the pedigree back to common ancestors 
of the parents and computing the probabilities at each segregation. Consider the simple 
pedigree in Fig. 5.1, representing a mating between half sibs. X is the individual 
whose inbreeding coefficient $F_X$ we want to know. Its parents P and Q are related 
through their common parent A. They are not related in any other way, so we only 
have to consider the transmission of A’s genes through P and Q to X, and to calculate 
the probability of X being an identical homozygote. Let $A_1$ and $A_2$ symbolize the 
genes carried by A at any particular locus. The probability that X is $A_1A_1$ is 
$1/16 = (\tfrac{1}{2})^4$ because the chance that $A_1$ is transmitted through each of the four paths AP, 
PX, AQ, QX, is $\tfrac{1}{2}$ for each path. The probability that X is $A_2A_2$ is similarly $(\tfrac{1}{2})^4$, 
and the probability that X is either $A_1A_1$ or $A_2A_2$ is $2(\tfrac{1}{2})^4 = (\tfrac{1}{2})^3 = \tfrac{1}{8}$. This probability 
of X being an identical homozygote represents the new inbreeding arising 
from A as a common ancestor of P and Q. The common ancestor A may, however, 
itself be an identical homozygote through previous inbreeding, in which case X will 
be an identical homozygote also if it gets the genotype $A_1A_2$ or $A_2A_1$ (the two being 
distinguished according to whether $A_1$ comes through P or through Q). The 
probability of each of these genotypes is $(\tfrac{1}{2})^4$ for the same reason as before, and 
the probability of one or the other is $(\tfrac{1}{2})^3$. The probability of A being an identical 
homozygote is its inbreeding coefficient, $F_A$. The additional probability of X being 
an identical homozygote through the previous inbreeding is then $(\tfrac{1}{2})^3 F_A$. Putting 
the two parts of the inbreeding together gives the inbreeding coefficient of X 
as $F_X = (\tfrac{1}{2})^3 + (\tfrac{1}{2})^3 F_A = (\tfrac{1}{2})^3 (1 + F_A)$. Note that the index 3 is the number of 
individuals in the path connecting the parents through their common ancestor, i.e., 
individuals P, A, and Q. This makes it easy to work out the probabilities simply 
by counting individuals in the path. In Fig. 5.1 there are only the parents and the 
common ancestor; in Fig. 5.2 the common ancestor is further back and the individuals 
to be counted are P, D, B, A, C, Q, making 6. $F_X$ in Fig. 5.2 is therefore 
$(\tfrac{1}{2})^6 (1 + F_A)$. In more complicated pedigrees, the parents may be related to each other 
through more than one common ancestor, or from the same common ancestor through 
different paths, as illustrated in Example 5.1. Each common ancestor, and each path, 
then contributes an additional probability of the progeny being an identical 
homozygote, and the inbreeding coefficient is obtained by adding together the separate 
probabilities for each of the paths through which the parents are related. 
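
The half-sib argument can be checked by listing the sixteen equally likely outcomes of the four segregations AP, PX, AQ and QX. The following is only a minimal sketch: the function name `half_sib_inbreeding` is hypothetical, and it assumes, as in the text, that P and Q are related through A alone.

```python
from fractions import Fraction
from itertools import product

def half_sib_inbreeding(F_A=Fraction(0)):
    """Exact F_X for the offspring X of half sibs P and Q with common parent A.

    F_A is the inbreeding coefficient of A (zero if A is not inbred).
    """
    half = Fraction(1, 2)
    total = Fraction(0)
    # For each parent of X we record two segregations:
    #   from_a: did the gene passed to X come from A (True) or from the
    #           parent's other, unrelated parent (False)?
    #   gene:   which of A's two genes (1 or 2) went to that parent?
    for from_a_p, gene_p, from_a_q, gene_q in product(
        [True, False], [1, 2], [True, False], [1, 2]
    ):
        prob = half ** 4  # four independent segregations, each with chance 1/2
        if not (from_a_p and from_a_q):
            continue  # one gene did not come from A: no identity by descent
        if gene_p == gene_q:
            total += prob        # A1A1 or A2A2: two copies of the same gene of A
        else:
            total += prob * F_A  # A1A2 or A2A1: identical only if A is inbred
    return total

print(half_sib_inbreeding())                # 1/8
print(half_sib_inbreeding(Fraction(1, 8)))  # 9/64
```

With a non-inbred common parent the enumeration gives 1/8, and with $F_A = 1/8$ it gives $(\tfrac{1}{2})^3 (1 + \tfrac{1}{8}) = 9/64$, in agreement with the expression derived above.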

Putting all this together gives the following general formula for the inbreeding 
coefficient of an individual: 


$F_X = \sum (\tfrac{1}{2})^n (1 + F_A)$ ... [5.1] 

where n is the number of individuals in any path of relationship counting the parents 
of X, the common ancestor, and all individuals in the path connecting parents to 
common ancestor; summation is over all paths of relationship. When inbreeding 
coefficients are calculated in this way, it is necessary to define the base population 
to which the present inbreeding is referred. Individuals in the base population are 
assigned inbreeding coefficients of zero. In practice the individuals of the base population 
may be simply those at the head of the pedigree, whose ancestry further back 
is not known. 


Example 5.1. The pedigree in Fig. 5.3 will illustrate the use of the formula [5.1]. The 
individual whose inbreeding coefficient is to be calculated is X. We have to look for 
paths through which X’s parents, P and Q, are related to each other. Paths contributing 
nothing to the relationship are dotted. It is assumed that there are no relationships between 
any of the individuals other than those shown. There are four individuals that are 
common ancestors, A, B, F, and J, causing relationship between P and Q. The paths 
of relationship and the calculation of $F_X$ (rounded to four decimal places) are shown 
in the table. The inbreeding coefficient of X works out to be 0.0606. The following points 
should be noted: (1) D and E are full sibs. Their relationship causes some inbreeding 
in P, one of the parents of X, but it causes no relationship between P and Q and so contributes 
nothing to $F_X$. (2) E and F are half sibs, and the inbreeding coefficient of J, 
one of the common ancestors of P and Q, is therefore 1/8 as in Fig. 5.1. (3) There are 
four paths connecting P with Q through B as a common ancestor, and all four must be 
included in the calculation. (4) No individual can appear twice in the same path. For 
example, P M J E B F J N Q is not a valid path, because the inbreeding it produces 
is fully taken account of by the inbreeding coefficient of J in the shorter path P M J 
N Q. (5) Finally, care must be taken not to traverse paths in the wrong direction: for 
example, F cannot transmit genes to P through K and N. 


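Paths of relationship    n    F of common ancestor    Contribution to $F_X$ 

P L D A E J N Q          8    0                       $(\tfrac{1}{2})^8 = 0.0039$ 
P L D B E J N Q          8    0                       $(\tfrac{1}{2})^8 = 0.0039$ 
P L D B F J N Q          8    0                       $(\tfrac{1}{2})^8 = 0.0039$ 
P L D B F K N Q          8    0                       $(\tfrac{1}{2})^8 = 0.0039$ 
P M J E B F K N Q        9    0                       $(\tfrac{1}{2})^9 = 0.0020$ 
P M J F K N Q            7    0                       $(\tfrac{1}{2})^7 = 0.0078$ 
P M J N Q                5    1/8                     $(\tfrac{1}{2})^5 \times \tfrac{9}{8} = 0.0352$ 

                                                      $F_X = 0.0606$ 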


When pedigrees are long and complicated, it may not be practicable to trace all 
the paths of relationship. A sufficiently accurate estimate of the inbreeding coefficient 
can, however, be got by sampling a limited number of paths (Wright and 
McPhee, 1925). 
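
The arithmetic of Example 5.1 can be repeated with exact fractions. This is a minimal sketch of formula [5.1] applied to the seven paths listed in the table; the names `PATHS`, `path_contribution` and `inbreeding_from_paths` are hypothetical, and the paths themselves still have to be found by inspecting the pedigree.

```python
from fractions import Fraction

# (path of relationship, F of the common ancestor), as listed in Example 5.1.
PATHS = [
    ("PLDAEJNQ", Fraction(0)),
    ("PLDBEJNQ", Fraction(0)),
    ("PLDBFJNQ", Fraction(0)),
    ("PLDBFKNQ", Fraction(0)),
    ("PMJEBFKNQ", Fraction(0)),
    ("PMJFKNQ", Fraction(0)),
    ("PMJNQ", Fraction(1, 8)),  # common ancestor J has half-sib parents
]

def path_contribution(path, f_ancestor):
    """One term of formula [5.1]: (1/2)^n (1 + F_A)."""
    n = len(path)  # individuals in the path, counting both parents of X
    return Fraction(1, 2) ** n * (1 + f_ancestor)

def inbreeding_from_paths(paths):
    """Sum the contributions of all paths of relationship."""
    return sum(path_contribution(path, f) for path, f in paths)

F_X = inbreeding_from_paths(PATHS)
print(F_X, float(F_X))  # 31/512 0.060546875
```

The exact sum is 31/512, or about 0.0605; the figure of 0.0606 in the table is the sum of the contributions after each has been rounded to four decimal places.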

Coancestry or kinship 

There is another method of computing inbreeding coefficients which is often more 
convenient and is more readily adapted to a variety of problems. It will be used 
in the next section to work out the inbreeding coefficients under regular systems 
of close inbreeding. Its chief uses in practice are for planning matings to give the 
least inbreeding, and for calculating the inbreeding coefficient generation by generation 
in a fully pedigreed population. The method does not differ in principle from 
the formula [5.1] given above, but instead of working from the present back to the 
common ancestors we work forward, keeping a running tally generation by generation, 
and compute the inbreeding that will result from the matings now being made. 
The inbreeding coefficient of an individual depends on the amount of common 
ancestry in its two parents. Therefore, instead of thinking about the inbreeding of 
the progeny, we can think of the degree of relationship by descent between the two 
parents. This is called the coancestry, or the coefficient of kinship or of consanguinity. 
It will be symbolized by $f$. The coancestry of any two individuals is identical with 
the inbreeding coefficient of their progeny if they were mated. Thus the coancestry 
of two individuals is the probability that two gametes taken at random, one from 
each, carry alleles that are identical by descent. 

A × B     C × D 
  |         | 
  P    ×    Q 
       | 
       X 

Fig. 5.4 

Consider the generalized pedigree in Fig. 5.4. X is an individual with parents 
P and Q and grandparents A, B, C, and D. Now, the coancestry of P with Q is 
fully determined by the coancestries relating A and B with C and D, and if these 
are known we need go no further back in the pedigree. It can be shown that the 
coancestry of P with Q is simply the mean of the four coancestries AC, AD, BC, 
and BD. This will be clearer if stated in the form of probabilities, though the explanation 
is cumbersome when put into words. Take one gamete at random from P and 
one from Q, and repeat this many times. In half the cases, P’s gamete will carry 
a gene from A and in half from B: similarly for Q’s gamete. So the two gametes, 
one from P and one from Q, will carry genes from A and C in a quarter of the 
cases, from A and D in a quarter, from B and C in a quarter, and from B and D 
in a quarter of the cases. Now the probability that two gametes taken at random, 
one from A and the other from C, are identical by descent is the coancestry of A 
with C, i.e., $f_{AC}$, etc. So, reverting now to symbols, 

$f_{PQ} = \tfrac{1}{4} f_{AC} + \tfrac{1}{4} f_{AD} + \tfrac{1}{4} f_{BC} + \tfrac{1}{4} f_{BD}$ 

This gives the basic rule relating coancestries in one generation with those in the next: 

$F_X = f_{PQ} = \tfrac{1}{4}(f_{AC} + f_{AD} + f_{BC} + f_{BD})$ ... [5.2] 

With this rule the experimenter can tabulate the coancestries generation by generation, 
and this gives a basis for planning matings and computing inbreeding coefficients. 
More detailed accounts of the operation are given by Plum (1954). 
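
The tabulation can also be written as a short recursive routine. The sketch below is an illustration only, not the procedure of any particular program. The pedigree is the one implied by the paths listed in Example 5.1 (Fig. 5.3 is not reproduced here, so this reading of it is an assumption), parents not named in those paths are assumed to be unrelated members of the base population, and the names `PEDIGREE`, `ORDER` and `coancestry` are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache

# Pedigree as {individual: (parent, parent)}, listed parents before offspring.
# None marks a parent that is not shown; it is treated as an unrelated,
# non-inbred member of the base population (an assumption of this sketch).
PEDIGREE = {
    "A": (None, None), "B": (None, None),
    "D": ("A", "B"), "E": ("A", "B"), "F": ("B", None),
    "J": ("E", "F"), "K": ("F", None), "L": ("D", None),
    "M": ("J", None), "N": ("J", "K"),
    "P": ("L", "M"), "Q": ("N", None),
}
ORDER = {name: i for i, name in enumerate(PEDIGREE)}

@lru_cache(maxsize=None)
def coancestry(a, b):
    """Coancestry f_ab of two individuals (None = unknown base individual)."""
    if a is None or b is None:
        return Fraction(0)
    if a == b:
        # Coancestry with self: (1/2)(1 + F), where F is the coancestry
        # of the individual's own parents.
        p1, p2 = PEDIGREE[a]
        return Fraction(1, 2) * (1 + coancestry(p1, p2))
    if ORDER[a] < ORDER[b]:
        a, b = b, a  # always expand the later-listed individual
    # Mean coancestry of the other individual with the two parents of a.
    p1, p2 = PEDIGREE[a]
    return Fraction(1, 2) * (coancestry(p1, b) + coancestry(p2, b))

print(coancestry("D", "E"))  # full sibs: 1/4
print(coancestry("E", "F"))  # half sibs: 1/8
print(coancestry("P", "Q"))  # inbreeding coefficient of X: 31/512
```

Expanding the later-listed individual ensures that the individual being replaced by its parents is never an ancestor of the other. The result for P and Q, 31/512 or about 0.0605, is the inbreeding coefficient of X and agrees with the path method of Example 5.1 apart from the rounding of the separate contributions there.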

If there is overlapping of generations we may need to find the coancestry of 
individuals belonging to different generations, for which a supplementary rule is 
needed. Consideration of probabilities shows that the coancestry of two individuals 
is equivalent to the mean coancestry of one individual with the two parents of the 
other. Thus, referring to the same pedigree (Fig. 5.4), the rule giving the coancestry 
of P with C and with D is 

$f_{PC} = \tfrac{1}{2}(f_{AC} + f_{BC})$ 

$f_{PD} = \tfrac{1}{2}(f_{AD} + f_{BD})$ ... [5.3] 

This rule gives also 

$f_{PQ} = \tfrac{1}{2}(f_{PC} + f_{PD})$ 

which by substitution from equation [5.3] reduces to the basic rule of equation [5.2]. 

Before we can apply the method to a pedigreed population, or to regular systems 
of inbreeding, we need to know the numerical values of some coancestries. The 
parents of the first generation have to be assumed to be all unrelated, with $f = 0$. 
The first non-zero coancestries are among their progeny, and when all these have 
been determined all subsequent generations can be calculated by the rules given above. 
The relationships whose coancestries may be needed in the first generation are offspring 
and parent, full sibs, half sibs, and self. The coancestries of these relationships 
are needed also in the next section for working out the consequences of continued 
inbreeding. The coancestries are as follows, starting with self because this appears 
in all the others. 

Self. The coancestry of an individual with itself, $f_{AA}$, is the inbreeding coefficient 
of progeny that would be produced by self-mating. This is the probability that two 
gametes taken at random from A will carry identical alleles, which is $\tfrac{1}{2}(1 + F_A)$ 
for the following reason. Let A’s genes be $A_1$ and $A_2$. The probability that two 
gametes taken at random are both $A_1$ or both $A_2$ is $\tfrac{1}{2}$. The probability that one is 
$A_1$ and the other $A_2$ is $\tfrac{1}{2}$, but then the probability that $A_1$ and $A_2$ are identical by 
descent is the inbreeding coefficient of A, $F_A$. Thus the total probability that the 
two gametes carry identical alleles is $\tfrac{1}{2} + \tfrac{1}{2}F_A$, and so 

$f_{AA} = \tfrac{1}{2}(1 + F_A)$ ... [5.4] 

If $F_A$ is known (or assumed) to be zero, then $f_{AA} = \tfrac{1}{2}$. 

Offspring and parent are in different generations, so the supplementary rule [5.3] 
is applicable. In Fig. 5.4 the coancestry of P with A is equal to the mean coancestry 
of P’s parents with A, i.e., 

$f_{PA} = \tfrac{1}{2}(f_{AB} + f_{AA})$ ... [5.5] 

If it is known or assumed that A and B are not related and A is not inbred, then 
$f_{AB} = 0$, $f_{AA} = \tfrac{1}{2}$, and the coancestry reduces to $f_{PA} = \tfrac{1}{4}$. 

Full sibs are in the same generation, so the basic rule [5.2] applies. The application 
of the rule is more easily understood if the pedigree is written as in Fig. 5.5. 

A × B     A × B 
  |         | 
  P    ×    Q 
       | 
       X 

Fig. 5.5 

A and B are the parents of both P and Q, which are full sibs and have an offspring 
X. Applying the basic rule [5.2] and noting that $f_{AB} = f_{BA}$, we have 

$f_{PQ} = \tfrac{1}{4}(2f_{AB} + f_{AA} + f_{BB})$ ... [5.6] 

With no previous inbreeding or relationship this reduces to $f_{PQ} = \tfrac{1}{4}$. 

Half sibs. Figure 5.1 gives a pedigree of half sibs. Applying the basic rule [5.2] 
and noting that A is a parent of both P and Q, gives 

$f_{PQ} = \tfrac{1}{4}(f_{AB} + f_{AC} + f_{BC} + f_{AA})$ ... [5.7] 

With no previous inbreeding or relationship this reduces to $f_{PQ} = \tfrac{1}{8}$. This result 
has already been obtained as the inbreeding coefficient of X, the offspring of P and 
Q, in Fig. 5.1. 

Regular systems of inbreeding 

A regular system of inbreeding is one in which the same mating system is applied 
in all generations, and all individuals in the same generation have the same inbreeding 
coefficient. Regular systems are most often used to produce rapid inbreeding, and 
so the matings are between close relatives. We shall deal first with matings between 
the four sorts of relative already considered. (For other systems, see Wright, 
1933, 1969). Then we shall deal with the inbreeding produced by backcrossing and 
in the generations following a cross. 

Close inbreeding 

The inbreeding coefficients in successive generations can be calculated from the 
coancestries given in equations [5.4] to [5.7]. But it is more convenient first to derive 
recurrence equations which relate the inbreeding coefficient in one generation to 
those of previous generations. The generation we are interested in is denoted by 
$t$, the previous one by $t - 1$, and the one before that by $t - 2$; $t - 3$ is as far 
back as we have to go with these four systems. The recurrence equations are derived 
as follows, and the inbreeding coefficients in successive generations are given in 
Table 5.1. 

Self-fertilization. If X in generation $t$ is the offspring of A in generation $t - 1$, 
equation [5.4] gives 

$F_X = f_{AA} = \tfrac{1}{2}(1 + F_A)$ 

and the recurrence equation is therefore 

$F_t = \tfrac{1}{2}(1 + F_{t-1})$ ... [5.8] 

In the first generation the parents are non-inbred and $F_{t-1} = 0$, which makes $F_{(t=1)} = \tfrac{1}{2}$. 
In the second generation $F_{t-1} = \tfrac{1}{2}$, and $F_{(t=2)}$ becomes $\tfrac{1}{2}(1 + \tfrac{1}{2}) = \tfrac{3}{4}$. Proceeding 
in this way allows one to write down the inbreeding coefficients in each 
successive generation. Note that in this case the rate of inbreeding is constant from 
the beginning and it corresponds exactly with equation [3.7]: $\Delta F = 1/2N = \tfrac{1}{2}$. This 
is not true of the other systems. Self-fertilization gives the most rapid inbreeding 
possible with a normal mating system. The inbreeding coefficient reaches 99.9 per 
cent after 10 generations. It is possible, however, to get complete homozygotes in 
one step by some forms of parthenogenesis and by manipulations such as doubling 
the chromosome complement of haploid cells. 
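
The figure of 99.9 per cent after 10 generations can be confirmed by solving the recurrence [5.8] explicitly. The short derivation below assumes only that the starting generation is non-inbred, $F_0 = 0$; the closed form is a consequence of [5.8] and is not a separately numbered equation of the text.

```latex
\begin{align*}
1 - F_t &= 1 - \tfrac{1}{2}(1 + F_{t-1}) = \tfrac{1}{2}(1 - F_{t-1}) \\
1 - F_t &= (\tfrac{1}{2})^t (1 - F_0) = (\tfrac{1}{2})^t \qquad (F_0 = 0) \\
F_{10} &= 1 - (\tfrac{1}{2})^{10} = 1 - \tfrac{1}{1024} \approx 0.9990
\end{align*}
```

The first line shows that the remaining heterozygosity, $1 - F$, is halved in every generation, which is what a constant rate of inbreeding of $\tfrac{1}{2}$ means.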

We shall deal with full-sib mating next because it is the most often used of the 
other systems. 

Full sibs. From the coancestry in equation [5.6], referring to Fig. 5.5, we have 

$F_X = f_{PQ} = \tfrac{1}{4}(2f_{AB} + f_{AA} + f_{BB})$ 

To get the recurrence equation we have to express the coancestries as inbreeding 
coefficients of previous generations. First, $f_{AB} = F_P = F_{t-1}$. Since individuals in 
the same generation have the same inbreeding coefficient, $f_{AA} = f_{BB} = \tfrac{1}{2}(1 + F_A)$ 
by equation [5.4], and $F_A = F_{t-2}$. Making these substitutions leads to the recurrence 
equation 

$F_t = \tfrac{1}{4}(1 + 2F_{t-1} + F_{t-2})$ ... [5.9] 

In the first generation, $F_{t-1}$ and $F_{t-2}$ are both zero and so $F_t = 0.25$. The inbreeding 
coefficients in the first four generations are 0.25, 0.375, 0.50, and 0.59. 
The rate of inbreeding is not constant in the first few generations, as may be seen 
by computing $\Delta F$ from equation [3.9]. For the first four generations $\Delta F$ is 0.25, 
0.17, 0.20, and 0.19. It later settles down to a constant value of 0.191. The effective 
population size is then $N_e = 2.6$ by equation [4.1]. 
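
These figures can be reproduced by iterating the full-sib recurrence. The sketch below makes two assumptions about equations that lie outside this section: that equation [3.9] defines the rate of inbreeding as $\Delta F = (F_t - F_{t-1})/(1 - F_{t-1})$, and that equation [4.1] gives $N_e = 1/(2\Delta F)$. The function names are hypothetical.

```python
def full_sib_F(generations):
    """F_0, F_1, ..., F_generations under continued full-sib mating."""
    F = [0.0, 0.0]  # the two generations before inbreeding starts
    for _ in range(generations):
        # F_t = (1/4)(1 + 2 F_{t-1} + F_{t-2})
        F.append(0.25 * (1 + 2 * F[-1] + F[-2]))
    return F[1:]  # drop the extra leading zero so that index = generation

def rate_of_inbreeding(F):
    """Delta F for generations 1, 2, ... (assumed form of equation [3.9])."""
    return [(F[t] - F[t - 1]) / (1 - F[t - 1]) for t in range(1, len(F))]

F = full_sib_F(30)
dF = rate_of_inbreeding(F)

print([round(x, 3) for x in F[1:5]])  # inbreeding coefficients, generations 1-4
print([round(x, 2) for x in dF[:4]])  # rates of inbreeding, generations 1-4
print(round(dF[-1], 3))               # rate after it has settled down
print(round(1 / (2 * dF[-1]), 1))     # corresponding effective size
```

After thirty generations the computed rate has settled at about 0.191, corresponding to an effective size of about 2.6, as stated in the text.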

Offspring-parent. We consider here only the mating of offspring with their younger 
parent; repeated backcrossing to the same parent will be considered later. Figure 
5.6 shows as much of the pedigree as is needed, lettered to correspond with Fig. 
5.4 and the coancestry in equation [5.5]. Each individual is an offspring in one generation 
and a parent in the next. The inbreeding of X is given by equation [5.5] as 
$F_X = f_{PA} = \tfrac{1}{2}(f_{AB} + f_{AA})$. The recurrence equation is obtained by substituting 
$f_{AB} = F_P = F_{t-1}$, and $f_{AA} = \tfrac{1}{2}(1 + F_A) = \tfrac{1}{2} + \tfrac{1}{2}F_{t-2}$. The recurrence equation 
then becomes identical with that for full sibs, equation [5.9]. This is true, however, 
only for autosomal genes; for sex-linked genes, parent-offspring mating gives a 
slightly higher rate of inbreeding, with $\Delta F = 0.293$ after the first few generations 
(Wright, 1933). 

[Fig. 5.6: pedigree of offspring-parent matings, arranged by generation.] 

Half sibs. Figure 5.1 gives the individuals to which the coancestry in equation [5.7] 
refers. To get the recurrence equation for repeated half-sib matings we have to know 
the relationship between individuals B and C. These could be either half sibs to each 
other or full sibs. With animals, B and C are usually females, both mated to the 
same male A. To continue half-sib mating with the equivalents of B and C always 
half sibs, it is necessary to mate one of the females to a second male, making 4 
individuals as parents in each generation. This is difficult in practice, but if it is 
done the recurrence equation, obtained by substitutions in the same manner as above, 
becomes 


$F_t = \tfrac{1}{8}(1 + 6F_{t-1} + F_{t-2})$ ... [5.10] 

It is easier to continue half-sib mating with B and C being always full sibs, and 
the number of parents in each generation is then three. The inbreeding then goes 
a little faster and the recurrence equation is 

$F_t = \tfrac{1}{16}(3 + 8F_{t-1} + 4F_{t-2} + F_{t-3})$ ... [5.11] 


Table 5.1 Inbreeding coefficients under various systems of close inbreeding, and 
probability of fixation under full-sib mating. 

Generation (t)    A        B (1)    B (2)    C        D 

 0                0        0        0        0        0 
 1                0.500    0.250    0        0.125    0.250 
 2                0.750    0.375    0.063    0.219    0.375 
 3                0.875    0.500    0.172    0.305    0.438 
 4                0.938    0.594    0.293    0.381    0.469 
 5                0.969    0.672    0.409    0.449    0.484 
 6                0.984    0.734    0.512    0.509    0.492 
 7                0.992    0.785    0.601    0.563    0.496 
 8                0.996    0.826    0.675    0.611    0.498 
 9                0.998    0.859    0.736    0.654    0.499 
10                0.999    0.886    0.785    0.691 
11                         0.908    0.826    0.725 
12                         0.926    0.859    0.755 
13                         0.940    0.886    0.782 
14                         0.951    0.908    0.806 
15                         0.961    0.925    0.827 
16                         0.968    0.940    0.846 
17                         0.974    0.951    0.863 
18                         0.979    0.960    0.878 
19                         0.983    0.968    0.891 
20                         0.986    0.975    0.903 

Column    System of mating                                   Recurrence equation 

A         Self-fertilization or repeated 
          backcrosses to highly inbred line.                 $\tfrac{1}{2}(1 + F_{t-1})$ 
B         Full brother × sister, or 
          offspring × younger parent: 
          (1) Inbreeding coefficient.                        $\tfrac{1}{4}(1 + 2F_{t-1} + F_{t-2})$ 
          (2) Probability of fixation (from 
              Schafer, 1937). 
C         Half sib (females half sisters).                   $\tfrac{1}{8}(1 + 6F_{t-1} + F_{t-2})$ 
D         Repeated backcrosses to 
          random-bred individual.                            $\tfrac{1}{4}(1 + 2F_{t-1})$ 
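
The inbreeding columns of Table 5.1 follow directly from the recurrence equations, and can be regenerated as in the following minimal sketch. The names `SYSTEMS` and `inbreeding_series` are hypothetical. Every series is started from non-inbred, unrelated generations (all earlier values of $F$ taken as zero); for the three-parent half-sib system of equation [5.11], which has no column in the table, that starting condition is an assumption of the sketch. Column B (2), the probability of fixation, is not given by a recurrence in this section and is not computed.

```python
# Each entry: (number of earlier generations needed, recurrence).
# The recurrence takes F_{t-1}, F_{t-2}, ... with the most recent first.
SYSTEMS = {
    "A": (1, lambda f1: 0.5 * (1 + f1)),                   # self-fertilization
    "B(1)": (2, lambda f1, f2: 0.25 * (1 + 2 * f1 + f2)),  # full sibs
    "C": (2, lambda f1, f2: 0.125 * (1 + 6 * f1 + f2)),    # half sibs, dams half sisters
    "C3": (3, lambda f1, f2, f3: (3 + 8 * f1 + 4 * f2 + f3) / 16),  # dams full sisters
    "D": (1, lambda f1: 0.25 * (1 + 2 * f1)),              # backcross to random-bred
}

def inbreeding_series(order, recurrence, generations):
    """Return F_0, F_1, ..., F_generations for one mating system."""
    history = [0.0] * order  # F_0 and any earlier generations are zero
    for _ in range(generations):
        recent = history[-1:-order - 1:-1]  # F_{t-1}, F_{t-2}, ...
        history.append(recurrence(*recent))
    return history[order - 1:]  # index now equals the generation number

table = {
    name: inbreeding_series(order, rec, 20)
    for name, (order, rec) in SYSTEMS.items()
}

for t in range(0, 21, 5):
    print(t, "  ".join(f"{name}={table[name][t]:.3f}" for name in table))
```

The values for the full-sister version (`C3`) lie above those of column C in every generation, which is the sense in which the inbreeding "goes a little faster" with three parents.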


Fixation 

One is often more interested in the probability of fixation as a consequence of inbreeding 
than in the inbreeding coefficient. The inbreeding coefficient gives the probability 
of an individual being a homozygote, which is $1 - 2p_0 q_0 (1 - F)$ from 
Table 3.1. But one wants to know also how soon all individuals in a line can be 
expected to be homozygous for the same allele at a particular locus. The probability 
of fixation depends on the number of alleles and their arrangement in the initial 
matings of the line. The probabilities of fixation at any one locus over the first 20 
generations of full-sib mating are given in column B (2) of Table 5.1 when four 
alleles were present in the initial mating. There cannot, of course, be more than 
four alleles in a sib-mated line, and when there are fewer the probability of fixation 
is greater (see Haldane, 1955). After the first few generations the probability of 
fixation becomes only a little less than the inbreeding coefficient. 

A question of more practical importance is the probability of fixation at all loci, 
or the proportion of the whole genome that is expected to be fixed. This is the degree 
of ‘purity’ implied by the term ‘pure line’ which is often used to mean a highly inbred 
line. We cannot get the probability of total fixation from the presumed number of 
loci and the probability of fixation at any one, because when one locus becomes 
fixed neighbouring loci linked to it become fixed too. Consequently the probability 
of total fixation depends on the total map length and, to a lesser extent, on the number 
of chromosomes. This is a complicated matter and cannot be explained here: for 
details see Stam (1980). Total fixation, as must be obvious, takes much longer than 
fixation at any one locus. For example, after 20 generations of sib-mating in an 
organism like the mouse virtually no individual can be expected to be completely 
homozygous and no line totally fixed. To reach the expectation that 95 per cent of 
individuals will be completely homozygous requires about 50 generations (Stam, 
1980). After this number of generations the probability of total fixation is nearly 
the same. So, putting the matter the other way round, we can conclude that after 
50 generations about 5 per cent of the genome is expected to be still heterogeneous 
in a sib-mated line of mice. New heterogeneity caused by mutation is described in 
Chapter 15. 

Repeated backcrosses 

Repeated backcrosses to an individual or to a highly inbred line are often made, 
for a variety of purposes. The resulting inbreeding is as follows. The pedigree (Fig. 
5.7) shows an individual A, which will probably be a male, mated to his daughter 
C, his granddaughter D, etc. From the supplementary rule [5.5] 

$F_X = f_{AD} = \tfrac{1}{2}(f_{AA} + f_{AC})$ 

$= \tfrac{1}{2}\{\tfrac{1}{2}(1 + F_A) + F_D\}$ 


A × B 
    | 
A × C 
    | 
A × D 
    | 
    X 

Fig. 5.7 




The recurrence equation is therefore 

$F_t = \tfrac{1}{4}(1 + F_A + 2F_{t-1})$ ... [5.12] 

where $F_A$ is the inbreeding coefficient of the individual to which the repeated 
backcrosses are made. If A is an individual from the base population and $F_A = 0$, 
the equation becomes 

$F_t = \tfrac{1}{4}(1 + 2F_{t-1})$ ... [5.13] 

The inbreeding coefficients over the first 9 generations are given in Table 5.1. If 
A is an individual from a highly inbred line and $F_A = 1$, the equation becomes 

$F_t = \tfrac{1}{2}(1 + F_{t-1})$ ... [5.14] 

which is identical with the equation for self-fertilization. In this case A need not 
be the same individual in successive generations: it can be any member of the inbred 
line. 
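
As a rough numerical check, the recurrence [5.12] can be iterated directly. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the progeny of the first cross (C in Fig. 5.7) are non-inbred, so that the series starts from F = 0.

```python
def backcross_inbreeding(f_a, generations):
    """Iterate F_t = (1 + F_A + 2*F_{t-1}) / 4 and return [F_1, ..., F_n].

    f_a is the inbreeding coefficient of the recurrent parent A.
    Assumption: the first-cross progeny are non-inbred (F_0 = 0).
    """
    f_prev = 0.0
    series = []
    for _ in range(generations):
        f_prev = 0.25 * (1.0 + f_a + 2.0 * f_prev)
        series.append(f_prev)
    return series


# Recurrent parent from the base population (F_A = 0) and from an inbred line (F_A = 1).
for f_a in (0.0, 1.0):
    values = backcross_inbreeding(f_a, 9)
    print(f"F_A = {f_a}: " + ", ".join(f"{v:.3f}" for v in values))
```

With F_A = 1 the series follows the self-fertilization pattern of equation [5.14]; with F_A = 0 it approaches 0.5, since half of each backcross individual's genes always come from the non-inbred parent A.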

The chief use of repeated backcrosses is to transfer a particular gene from one 
strain into the genetic background of another strain. A problem then arises as to 
the length of foreign chromosome that will be transferred along with the desired 
gene. A dominant gene can be transferred by successive crosses of the heterozygote 
to the strain into which it is to be introduced. It can be shown (see Crow and Kimura, 
1970, p. 94) that in this case the mean length of chromosome introduced with the 
gene after $t$ crosses is approximately $100/t$ cM on each side of the gene, or $200/t$ cM 
altogether. (1 centimorgan (cM) is the map distance corresponding to a recombination 
frequency of 1 per cent.) A recessive gene is commonly transferred by alternating 
backcrosses and intercrosses from which the homozygote is extracted. The 
mean length of foreign chromosome in this case is about $200/t$ cM on each side, 
or $400/t$ cM altogether, after $t$ cycles (Bartlett and Haldane, 1935). From the length 
of linked chromosome transferred and the total map length of the organism, we can 
arrive at the expected proportion of the total genome that is still heterogeneous. Suppose, 
for example, that a dominant gene is transferred to an inbred mouse strain 
by five backcrosses. The gene would carry with it a length of linked chromosome 
amounting to $200/5 = 40$ cM. Taking the total map length of the mouse to be 
1,600 cM (Roderick and Davisson, 1981), this heterogeneous segment would represent 
2.5 per cent of the total genome. In addition, some proportion of the genome 
not associated with the gene being transferred is expected to be still heterogeneous. 
This can be taken as approximately $1 - F$ which, from column A of Table 5.1, is 
3 per cent after 5 backcrosses. So in all about 5.5 per cent of the genome is expected 
to be still heterogeneous. An exact treatment of the problem is given by Stam and 
Zeven (1981). The transference of histocompatibility genes has special problems, 
which are considered by Johnson (1981). 
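
The arithmetic of the mouse example can be sketched as follows. The function names are hypothetical, and the unlinked part assumes that 1 - F equals (1/2)^t, as implied by equation [5.14] when the series starts from F = 0; this is an illustration of the approximation, not of the exact treatment of Stam and Zeven.

```python
def linked_length_cm(t, dominant=True):
    """Mean total length (cM) of foreign chromosome carried with the gene.

    Uses the approximations quoted in the text: 200/t cM for a dominant gene
    after t backcrosses, 400/t cM for a recessive gene after t cycles.
    """
    return (200.0 if dominant else 400.0) / t


def heterogeneous_fraction_dominant(t, map_length_cm=1600.0):
    """Return (linked, unlinked, total) fractions of the genome still heterogeneous.

    Assumption: the unlinked part is 1 - F = 0.5**t, i.e. backcrossing to a
    fully inbred strain starting from F = 0.
    """
    linked = linked_length_cm(t, dominant=True) / map_length_cm
    unlinked = 0.5 ** t
    return linked, unlinked, linked + unlinked


linked, unlinked, total = heterogeneous_fraction_dominant(5)
print(f"linked: {linked:.1%}, unlinked: {unlinked:.1%}, total: {total:.1%}")
```

For five backcrosses this gives 2.5 per cent linked and about 3.1 per cent unlinked, in line with the rounded figures quoted in the text.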

Crosses and subsequent generations 

A standard procedure in genetical analysis and in breeding, particularly plant 
breeding, is to make crosses between highly inbred lines and to raise the $F_1$, $F_2$ and 
subsequent generations. What is the inbreeding coefficient in the subsequent generations 
if these are maintained as a large random-bred population? This question is 
easily answered by consideration of types of gamete, but it is not difficult to verify 
the solution by the rules of coancestry. We shall consider populations derived from 
2-way and from 4-way crosses, as shown in Fig. 5.8. The foundation generation 
of the random-bred population derived from the cross is represented by the individuals 
marked O, of which there are a large number to be mated at random. It is these 
individuals O whose inbreeding coefficient we have to find. 

[Fig. 5.8. Pedigrees of populations derived from a 2-way cross (inbreds A x B; $F_1$ mating $X_1$ x $X_2$; $F_2$ individuals O) and from a 4-way cross (inbreds A x B and C x D; $F_1$ matings $X_1$ x $Y_1$ and $X_2$ x $Y_2$; $F_2$ mating $Z_1$ x $Z_2$; $F_3$ individuals O).]

The inbred lines have inbreeding coefficients of $F = 1$. This means that all the 
gametes produced by one inbred line are identical. Consequently the $F_1$ individuals 
of the same cross have identical genotypes. Therefore, in the 2-way cross all the 
matings of pairs of $F_1$ individuals to produce the $F_2$ are equivalent to the 
self-fertilization of one individual. The individuals O in $F_2$ are thus equivalent to the 
progeny of self-fertilization of one individual, and their inbreeding coefficient is 
$F = 0.5$. In the 4-way cross, individuals $Z_1$ and $Z_2$ are related as full sibs since 
$X_1$ and $X_2$ are genetically identical, and so are $Y_1$ and $Y_2$, but X and Y are not 
related (compare Fig. 5.5). Consequently the individuals O in the $F_3$ generation 
have an inbreeding coefficient of $F = 0.25$. 
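
The verification by the rules of coancestry can be sketched in code. This is a minimal illustration, not a general pedigree program: the helper name and the dictionary layout are hypothetical, founders are assumed to be mutually unrelated, and every individual must be listed after its parents. The coancestry of two individuals is found by expanding the younger one into its parents, and the inbreeding coefficient of O is the coancestry of its two parents.

```python
def make_coancestry(pedigree, founder_f):
    """Return a function f(x, y) giving the coancestry of x and y.

    pedigree maps each individual to (parent1, parent2), or to None for a
    founder; parents must be listed before their offspring.
    founder_f maps each founder to its own inbreeding coefficient.
    """
    order = {name: i for i, name in enumerate(pedigree)}
    cache = {}

    def coancestry(x, y):
        key = (x, y) if order[x] <= order[y] else (y, x)
        if key in cache:
            return cache[key]
        older, younger = key
        parents = pedigree[younger]
        if older == younger:
            # Coancestry of an individual with itself is (1 + F) / 2.
            f_self = founder_f[younger] if parents is None else coancestry(*parents)
            value = 0.5 * (1.0 + f_self)
        elif parents is None:
            value = 0.0  # distinct founders are assumed unrelated
        else:
            value = 0.5 * (coancestry(parents[0], older) + coancestry(parents[1], older))
        cache[key] = value
        return value

    return coancestry


INBRED = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}  # fully inbred parental lines
TWO_WAY = {"A": None, "B": None, "X1": ("A", "B"), "X2": ("A", "B")}
FOUR_WAY = {
    "A": None, "B": None, "C": None, "D": None,
    "X1": ("A", "B"), "X2": ("A", "B"),
    "Y1": ("C", "D"), "Y2": ("C", "D"),
    "Z1": ("X1", "Y1"), "Z2": ("X2", "Y2"),
}

f_two_way = make_coancestry(TWO_WAY, INBRED)("X1", "X2")    # F of O in the F2
f_four_way = make_coancestry(FOUR_WAY, INBRED)("Z1", "Z2")  # F of O in the F3
print(f_two_way, f_four_way)
```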

These inbreeding coefficients of the derived populations have no meaning unless 
the base population to which they refer is defined. The base population implicit in 
the reasoning above is some real or hypothetical random-breeding population from 
which the inbred lines were derived. The inbred lines used in the crosses are assumed 
to be a random sample of all possible lines produced without any change of the mean 
gene frequencies, i.e., with no selection. With the base population defined in this 
way, the meaning of the inbreeding coefficient of the derived population is as follows. 
If we made a large number of 2-way, or of 4-way, crosses each with a different 
set of inbred lines, the populations derived from the crosses would constitute a set 
of lines or sub-populations. The inbreeding coefficient would then indicate the 
expected amount of dispersion of gene frequencies among these lines. Populations 
derived from 2-way crosses are equivalent to progenies of one generation of 
self-fertilization. The gene frequencies can therefore have only three values, $0$, $\tfrac{1}{2}$, and 
$1$. Populations derived from 4-way crosses are equivalent to progenies of one generation 
of full-sib mating, and the gene frequencies can have only five values, $0$, $\tfrac{1}{4}$, 
$\tfrac{1}{2}$, $\tfrac{3}{4}$, and $1$. 

Mixed inbreeding and crossing 

Many plants are ‘inbreeders’, reproducing normally by self-fertilization. In many 
of these, however, some cross-pollination regularly occurs. The proportion of crossing 
varies widely, ranging, for example in lima beans and sorghum varieties, from 
around 5 per cent up to 50 per cent (Allard, Jain, and Workman, 1968). How much 
heterozygosity will the crossing generate? It is assumed that the whole population 
is large, and that whether an individual selfs or crosses is random, being unrelated 
to what its parents did. In any generation there are two sorts of progeny, those produced 
by self-fertilization and those produced by crossing. Let $C$ be the proportion 
of individuals produced by crossing; their inbreeding coefficient is zero. The proportion 
produced by selfing is $(1 - C)$, and their inbreeding coefficient is 
$F_t = \tfrac{1}{2}(1 + F_{t-1})$ by equation [5.8]. The average inbreeding coefficient is therefore 

$$F_t = \tfrac{1}{2}(1 + F_{t-1})(1 - C)$$ 

If the rate of crossing remains constant, the average inbreeding coefficient reaches 
an equilibrium level at which it remains. Then $F_t = F_{t-1}$, and rearrangement of the 
above equation gives the average inbreeding coefficient at equilibrium as 

$$F = \frac{1 - C}{1 + C}$$ ... [5.15] 

where $C$ is the proportion of individuals produced by cross-pollination. On the 
assumption that there is no selection for or against heterozygotes, the expected frequency 
of heterozygotes relative to a fully random breeding population is $1 - F$, 
by equation [3.15]. Application of equations [5.15] and [3.15] shows that 5 per cent 
of crossing generates heterozygosity amounting to 9.5 per cent of that of a random-breeding 
population, and 50 per cent of crossing generates 66.7 per cent. Studies 
of barley have shown the frequencies of heterozygotes at four esterase loci to be 
greater than expected from the known amount of crossing, which was 0.57 per cent, 
the excess being attributed to selection favouring heterozygotes (Allard, Kahler, and 
Weir, 1972). 
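
The equilibrium of equation [5.15] can be checked numerically by iterating the recurrence for the average inbreeding coefficient until it settles. The function names below are hypothetical, and the sketch assumes a constant rate of crossing and no selection, as in the text.

```python
def equilibrium_f(c):
    """Equilibrium inbreeding coefficient, equation [5.15]: F = (1 - C) / (1 + C)."""
    return (1.0 - c) / (1.0 + c)


def iterate_f(c, generations=200, f0=0.0):
    """Iterate F_t = (1/2)(1 + F_{t-1})(1 - C) from a starting value f0."""
    f = f0
    for _ in range(generations):
        f = 0.5 * (1.0 + f) * (1.0 - c)
    return f


for c in (0.05, 0.5):
    f_eq = equilibrium_f(c)
    print(f"C = {c}: F = {f_eq:.4f}, iterated F = {iterate_f(c):.4f}, "
          f"relative heterozygosity 1 - F = {1.0 - f_eq:.3f}")
```

The relative heterozygosity comes out at about 0.095 for C = 0.05 and 0.667 for C = 0.5, matching the percentages quoted in the text.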

The effect of crossing on the structure of a predominantly inbreeding population 
is more important than the generation of heterozygosity. With no crossing, an 
inbreeding population consists of completely homozygous lines, and natural selection 
operates through the elimination of the less well-adapted lines. Each local habitat 
is then inhabited by the line best adapted to it (Allard, Jain, and Workman, 1968), 
but no further adaptation can take place, nor new adaptation to different habitats. 
With some crossing, however, new lines are constantly generated, with genes recombined 
from the existing lines, and this allows continued, or new, adaptation to take 
place. Crossing also makes possible the elimination of deleterious genes that have 
arisen by mutation and been fixed by the inbreeding. 

The converse problem is also of interest, namely a small amount of inbreeding 
in a predominantly outbreeding population. Substitution of high values of $C$ (the 
proportion crossing) into equation [5.15] shows that a small amount of selfing raises 
the average inbreeding coefficient by very little. The reason for this is that the population 
does not become differentiated into permanent lines; the progeny of selfing are 
most likely themselves to cross-breed. If the inbreeding is by full-sib mating rather 
than selfing, the inbreeding coefficient expressed in a way analogous to equation 
[5.15] is $F = (1 - C)/(1 + 3C)$. Expressed in terms of the proportion of individuals 
produced by sib-mating, $S (= 1 - C)$, the formula becomes 

$$F = \frac{S}{4 - 3S}$$ ... [5.16] 

(see Li, 1976, for details). For example, 5 per cent of full-sib mating in a large 
population would raise the average inbreeding coefficient from 0 to about 1.3 per cent, 
and 10 per cent would raise it to about 2.7 per cent. The practical implication of this is 
that anyone keeping a stock by random breeding, or by minimal inbreeding, need 
not worry about the consequences of an occasional sib mating, a point already noted 
in Chapter 4. 
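
Equation [5.16] can be evaluated for any proportion of sib-mating, and compared with a direct iteration. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the recurrence used for the iteration, F_t = (S/4)(1 + 2F_{t-1} + F_{t-2}), is the full-sib recurrence weighted by the proportion S of sib-mated progeny, which is my reading of how the equilibrium arises rather than a formula quoted from the text.

```python
def equilibrium_f_sib(s):
    """Equilibrium inbreeding with a proportion s of full-sib matings, equation [5.16]."""
    return s / (4.0 - 3.0 * s)


def iterate_f_sib(s, generations=200):
    """Iterate the assumed recurrence F_t = (s/4)(1 + 2*F_{t-1} + F_{t-2}) from F = 0."""
    f_prev2, f_prev1 = 0.0, 0.0
    for _ in range(generations):
        f_prev2, f_prev1 = f_prev1, 0.25 * s * (1.0 + 2.0 * f_prev1 + f_prev2)
    return f_prev1


for s in (0.05, 0.10):
    print(f"S = {s}: formula {equilibrium_f_sib(s):.4f}, iterated {iterate_f_sib(s):.4f}")
```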

Change of base: structured population 

The question to be considered here is not confined to pedigreed populations or close 
inbreeding. Having computed a coefficient of inbreeding with reference to a certain 
group of individuals as the base population, one may then want to know the inbreeding 
coefficient referred to a different base, either more or less remote in the ancestry. 
For example, an individual produced by a full-sib mating is 25 per cent inbred with 
reference to its parents. Its parents may themselves be inbred with reference to a 
more remote base. What is the inbreeding coefficient of the individual with reference 
to this more remote base population? This question implies a ‘structured population’ 
with a hierarchical subdivision into lines and sublines, as illustrated in Fig. 5.9. 
In Fig. 5.9, A represents the further-back base population with which we are concerned, 
B is a later stage, and X represents the individuals whose inbreeding coefficient 
is to be calculated. The unlettered circles contemporary with B represent the 
subdivision of A into lines in the manner of Fig. 3.1. In a real population only one 
of these lines, B, may actually exist. Line B is then further divided into sublines 
and X is an individual in one of these. The solution comes from a consideration 
of the relative frequencies of heterozygotes. Let $H_X$, $H_B$, and $H_A$ be the frequencies 
of heterozygotes among the contemporaries of X, B and A respectively. Then 
$H_X/H_A = (H_X/H_B)(H_B/H_A)$, and it follows from equation [3.15] that 

$$P_{X \cdot A} = P_{X \cdot B} P_{B \cdot A}$$ ... [5.17] 

where $P_{X \cdot A} = 1 - F_{X \cdot A}$, $F_{X \cdot A}$ being the inbreeding coefficient of X referred to A 
as base, and similarly for the other subscripts. The relationship in equation [5.17] 
can be extended to any number of categories of subdivision, or stages of inbreeding. 

[Fig. 5.9. Structured population: base population A subdivided into lines, one of which is B; line B subdivided into sublines, one of which contains the individual X.]

Equation [5.17] is an equivalent way of expressing what are known as Wright’s 
F-statistics, used to describe structured populations. In the terminology of the 
F-statistics, $F_{IS}$ is the inbreeding coefficient of an individual relative to its own 
sub-population and is equivalent to $1 - P_{X \cdot B}$ above; $F_{ST}$ is the average inbreeding of 
the sub-population relative to the whole population, equivalent to $1 - P_{B \cdot A}$ above; 
and $F_{IT}$ is the inbreeding coefficient of the individual relative to the whole population. 
The relationship equivalent to equation [5.17] is 

$$(1 - F_{IT}) = (1 - F_{IS})(1 - F_{ST})$$ 


Example 5.2. A strain of mice was bred for 42 generations with an effective population 
size of about 40, and was then inbred by full-sib mating for a further 11 generations 
(Falconer, 1971). What was the inbreeding coefficient at the end? The inbreeding produced 
by the full-sib mating, i.e., from B to X in Fig. 5.9, was 0.908, from Table 5.1. 
Thus $P_{X \cdot B} = 1 - 0.908 = 0.092$. The inbreeding in the line when the sib-mating was 
started, i.e., from A to B, was as follows: with $N_e = 40$, equation [4.1] gives 
$\Delta F = 0.0125$, and after 42 generations equation [3.12] gives $F_{B \cdot A} = 0.410$. Thus 
$P_{B \cdot A} = 1 - 0.410 = 0.590$. By equation [5.17], $P_{X \cdot A} = 0.590 \times 0.092 = 0.054$. Thus the 
inbreeding coefficient at the end, referred to the origin of the line as base, was 
$F_{X \cdot A} = 1 - P_{X \cdot A} = 0.946$. 
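
The arithmetic of Example 5.2 can be reproduced in a few lines. The function names are hypothetical. The forms used for equations [4.1] and [3.12], namely a rate of 1/(2N_e) and F_t = 1 - (1 - rate)^t, are assumed here because they reproduce the values 0.0125 and 0.410 quoted in the example; the full-sib value 0.908 is taken directly from the text.

```python
def inbreeding_from_drift(n_e, generations):
    """F after t generations at effective size n_e.

    Assumed forms: rate of inbreeding 1 / (2 * n_e), and
    F_t = 1 - (1 - rate) ** t.
    """
    delta_f = 1.0 / (2.0 * n_e)
    return 1.0 - (1.0 - delta_f) ** generations


def combine_stages(*f_values):
    """Combine inbreeding over successive stages by multiplying P = 1 - F (equation [5.17])."""
    p = 1.0
    for f in f_values:
        p *= 1.0 - f
    return 1.0 - p


f_ba = inbreeding_from_drift(40, 42)  # from base A to the start of sib-mating, B
f_xb = 0.908                          # 11 generations of full-sib mating, quoted from Table 5.1
f_xa = combine_stages(f_xb, f_ba)
print(f"F(B.A) = {f_ba:.3f}, F(X.A) = {f_xa:.3f}")
```

Because the stages combine by multiplying the P values, the same helper extends to any number of stages of inbreeding.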

Mutation 

After a long period of inbreeding, mutation may become an important factor in determining 
the frequency of heterozygotes. If $u$ is the mutation rate of a gene that has 
reached near-fixation in the line, then the frequency of heterozygotes at this locus 
due to mutation is $4u$ under self-fertilization, and $12u$ under full-sib mating, for 
autosomal loci (Haldane, 1936). These are very small frequencies if we are concerned 
with only one locus, but if the effects of all loci are taken together, mutation 
is not entirely negligible as a source of heterozygosis in long-inbred strains such 
as the widely used strains of mice. The practical consequences of the origin of 
heterogeneity by mutation are that the characteristics of a line slowly change through 
the fixation of mutant alleles, and that sub-lines become differentiated. An example 
is given in Chapter 15. 

Selection favouring heterozygotes 

When close inbreeding is practised, the object is generally to produce fixation, or 
homozygosis within the lines. It is therefore a matter of some importance to know 
how selection will affect the progress toward fixation. Selection against a deleterious 
recessive may prevent the deleterious allele from becoming fixed, but it will not 
delay the fixation of the more favourable allele. Selection that favours heterozygotes, 
however, is another matter. A consequence of inbreeding almost universally observed 
is a reduction of fitness, the reasons for which will be given in Chapter 14. Thus 
selection resists the inbreeding, since the more homozygous individuals are the less 
fit, and this can only mean that selection favours heterozygotes — not necessarily 
heterozygotes of the loci taken singly, but heterozygotes of segments of chromosome. 
It is only necessary to have two deleterious genes, recessive or partially recessive, 
linked in repulsion, to confer a selective advantage on the heterozygotes of the segment 
of chromosome within which the genes are located. It is therefore important 
to find out how the opposing tendencies of inbreeding and selection in favour of 
heterozygotes balance each other, in order to assess the reliability of the computed 
inbreeding coefficient as a measure of the probability of fixation. 

The outcome of the joint action of inbreeding and selection in favour of 
heterozygotes depends on whether there is replacement of the less fit lines by the 
more fit; in other words, on whether selection operates between lines or only within 
lines. Within any one line, selection against homozygotes only delays the progress 
toward fixation and cannot arrest it, the delay being roughly in proportion to the 
intensity of the selection (Reeve, 1955a). Table 5.2 shows the rates of inbreeding 
with various intensities of selection, when there are two alleles and selection acts 
equally against both homozygotes. (The rate of inbreeding, $\Delta F$, is used here to mean 
the rate of dispersion of gene frequencies and, after the first few generations when 
the distribution of gene frequencies has become flat, it measures the rate of fixation, 
i.e., the proportion of unfixed loci that become fixed in each generation, as explained 
in Chapter 3.) The delay of fixation caused by selection is least under the closest 
systems of inbreeding. Thus the rate is halved under self-fertilization when the coefficient 
of selection is 0.67; under full-sib mating when it is 0.44; and under half-sib 
mating when it is 0.35. It will be seen from the table that the rate of inbreeding, 
though much reduced by intense selection, does not become zero until the coefficient 
of selection rises to 1. If there is only one line, therefore, fixation eventually 
goes to completion unless both homozygotes are entirely inviable or sterile. 
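
For self-fertilization the rate can be worked out from first principles, which gives a check on the selfing column of Table 5.2. The derivation is mine, not the text's: a selfed heterozygote produces genotypes in the ratio 1 : 2 : 1, and if both homozygotes have fitness 1 - s the surviving fraction that is heterozygous is 1/(2 - s), so the proportional loss of heterozygotes per generation is (1 - s)/(2 - s). The function name is hypothetical.

```python
def selfing_rate_of_inbreeding(s):
    """Rate of inbreeding under selfing with selection s against both homozygotes.

    Assumed derivation: heterozygotes fall to a fraction 1 / (2 - s) of their
    previous frequency each generation, so the rate is (1 - s) / (2 - s).
    """
    return (1.0 - s) / (2.0 - s)


for s in (0.0, 0.2, 0.4, 0.6, 0.75, 0.8):
    print(f"s = {s:<4}: rate = {100.0 * selfing_rate_of_inbreeding(s):.2f} per cent")
```

The same expression gives a rate of 25 per cent, half the unselected rate, when s = 2/3, consistent with the figure of 0.67 quoted in the text.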

Table 5.2 Rate of inbreeding, $\Delta F$, with selection favouring the heterozygote. 
(Except with self-fertilization, the rates are only approximate over the first few 
generations of inbreeding.) 

Coefficient of                          $\Delta F$ (%) 
selection against       Self-fertilization    Full sib    Half sib* 
the homozygotes (s) 

0                       50.00                 19.10       13.01 
0.2                     44.44                 14.88        9.32 
0.4                     37.50                 10.32        5.67 
0.6                     28.57                  5.71        2.48 
0.75                    20.00                  2.62        0.82 
0.8                     16.67                  1.76        0.46 

* Females full sisters to each other. 

If there are many lines, however, selection may arrest the progress of fixation and 
lead to a state of equilibrium, for the following reason. The amount by which the 
inbreeding has changed the frequency of a particular gene from its original value 
differs at any one time from line to line. In other words, the state of dispersion of 
the locus has gone further in some lines than in others. Now, if those lines in which 
the dispersion has gone furthest, and which are consequently most reduced in fitness, 
die out or are discarded, and if they are replaced by sub-lines taken from the lines 
in which it has gone least far, then the progress of the dispersive process will have 
been set back. When there is replacement of lines in this way, and the selection 
is sufficiently intense, a state of balance between the opposing tendencies of inbreeding 
and selection is reached. The intensity of selection needed to arrest the dispersive 
process has been worked out for regular systems of close inbreeding (Hayman and 
Mather, 1953). Some of the conclusions, for the case of two alleles with equal selection 
against the two homozygotes, are given in Table 5.3, which shows the intensity 
of selection against the homozygotes which will (1) just allow fixation to go eventually 
to completion, and (2) arrest the dispersive process at a point of balance where 
the frequency of heterozygotes is half its original value, i.e., where $P = 1 - F = 0.5$. 
These figures show that only a moderate advantage of heterozygotes will suffice 
to prevent complete fixation. Under full-sib mating, for example, loci, or 
segments of chromosomes that do not recombine, with a 25 per cent disadvantage 
in homozygotes will not all go to fixation. And, of those with a 50 per cent disadvantage, 
only about half will become fixed, no matter for how long the inbreeding 
is continued. 

It must be stressed, however, that prevention of fixation in this way can only take 
place when there is replacement of lines and sub-lines. The following breeding 
methods, for example, would allow replacement of lines: if seed, set by self- 
fertilization, were collected in bulk and a random sample taken for planting, and 
this were repeated in successive generations; or, if sib pairs of mice were taken at 
random from all the surviving progeny, so that the same amount of breeding space 
was occupied in successive generations. 

Table 5.3 Balance between inbreeding and selection in favour of heterozygotes, 
when selection operates between lines. The figures are the selective disadvantages 
of homozygotes, $s$, expressed as percentages. Column (a) shows the highest value 
of $s$ compatible with complete fixation. Column (b) shows the value of $s$ that leads 
to a steady state at $P = 1 - F = 0.5$. 

Mating system                       (a)          (b) 
                                    (P = 0)      (P = 0.5) 

Self-fertilization                  50.0         66.7 
Full-sib                            23.7         44.6 
Half-sib (females half sisters)     18.8         47.2 

The conclusions outlined above refer to a single locus. If there were more than 
a few loci on different chromosomes all subject to selection against homozygotes 
of an intensity sufficient to arrest or seriously delay the progress of inbreeding, the 
total loss of fitness from all the loci would be very severe. Inbred lines of organisms 
with a high reproductive rate, such as plants and Drosophila, might well stand up 
to a total loss of fitness sufficient to keep several loci or segments of chromosome 
permanently unfixed. But the loss of fitness involved in preventing the fixation of 
more than two or three loci in an organism such as the mouse would be crippling. 
Under laboratory conditions the highly inbred strains of mice, after 100 or more 
generations of sib-mating, have a fitness not much less than half that of non-inbred 
strains. It is conceivable that they might have one locus permanently unfixed, but 
it is difficult to believe that they can have more. Complete lethality or sterility of 
both homozygotes at one locus means a 50 per cent loss of progeny; at two unlinked 
loci, a 75 per cent loss. A mouse strain with a mortality or sterility of 50 per cent 
can be kept going, but hardly one with 75 per cent. 
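
The loss of progeny quoted for one and two loci follows from simple multiplication. As a small sketch (hypothetical function name), assume that every parent is heterozygous at each of n unlinked loci and that both homozygotes at each locus are lethal or sterile, so that only the half of the progeny that is heterozygous at a given locus survives.

```python
def progeny_loss(n_loci):
    """Fraction of progeny lost when both homozygotes are lethal at n unlinked loci.

    Assumption: all parents are heterozygous at every locus, so the surviving
    (heterozygous) fraction is 1/2 per locus, independently across loci.
    """
    return 1.0 - 0.5 ** n_loci


for n in (1, 2, 3):
    print(f"{n} locus/loci: {progeny_loss(n):.1%} of progeny lost")
```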

Problems 

5.1 What are the inbreeding coefficients in the offspring of marriages between the 
following relatives? (1) single first cousins, (2) double first cousins, (3) uncle-niece. 

[Solution 4] 

5.2 What is the coancestry of the children of a pair of identical twins married to 
unrelated individuals? 

[Solution 14] 

5.3 The following is a human pedigree of the absence of the corpus callosum. In 
generation I, individuals 1 and 4 are full sibs and so are 2 and 3. In generation IV, 
X represents a family of eight with two affected individuals. Calculate the inbreeding 
coefficient of this family and that of its parent, III 2. 



[Pedigree diagram for Problem 5.3: not recoverable from the source.] 

Data from Shapira, Y. & Cohen, T. (1973) J. Med. Genet., 10, 266-9. 

[Solution 24] 

5.4 If a predominantly self-fertilizing plant regularly cross-pollinates with a frequency 
of 1 per cent, what will be the frequency of heterozygotes at a 2-allele locus 
with gene frequencies of 0.2 and 0.8, assuming no selection? 


[Solution 34] 

5.5 Suppose that a population of a predominantly self-fertilizing plant is polymorphic 
for two alleles, a and b, and the frequencies of the three genotypes are 

aa ab bb 

0.54 0.12 0.34 

What frequency of cross-pollination does this indicate, assuming there is no selection? 

[Solution 44] 

5.6 What would be the inbreeding coefficient of a population kept for 10 generations 
with an effective size of 16, and then for a further 10 generations with double 
that size? To how many generations of full-sib mating is it equivalent? 

[Solution 54] 

5.7 Two highly inbred lines of a plant are crossed to produce an $F_1$ generation. 
The $F_1$ individuals are selfed to produce an $F_2$. Individuals of the $F_2$ are then 
backcrossed to the $F_1$ and to one of the inbred lines. What are the inbreeding 
coefficients of the progeny of these two backcrosses? 


[Solution 64] 

5.8 Consanguineous marriage increases the risk of the children suffering from 
recessive diseases. Work out how much the risk is increased by cousin marriage 
(single first cousins) for (1) cystic fibrosis with a population incidence of 1/2,500 
and (2) phenylketonuria with an incidence of 1/11,000. 


[Solution 74] 

5.9 Suppose that a proportion, $y$, of individuals in a population are produced by 
consanguineous matings giving them all an inbreeding coefficient of $F$, while the 
remainder, $1 - y$, are produced by random mating, e.g. a human population with 
some cousin marriages. If homozygotes of a recessive gene occur with an incidence 
of $I$ in the population as a whole, show that the gene frequency, $q$, of the recessive 
allele is estimated from the overall incidence, $I$, by 

$$(1 - yF)q^2 + yFq = I$$ 


[Solution 84] 

It will be obvious, to biologists and laymen alike, that the sort of variation discussed 
in the foregoing chapters embraces but a small part of the naturally occurring variation. 
One has only to consider one’s fellow men and women to realize that they 
all differ in countless ways, but that these differences are nearly all matters of degree 
and seldom present clear-cut distinctions attributable to the segregation of single 
genes. If, for example, we were to classify individuals according to their height, 
we could not put them into groups labelled ‘tall’ and ‘short’, because there are all 
degrees of height and a division into classes would be purely arbitrary. Variation 
of this sort, without natural discontinuities, is called continuous variation, and 
characters that exhibit it are called quantitative characters or metric characters, 
because their study depends on measurement instead of on counting. The genetic 
principles underlying the inheritance of metric characters are basically those of 
population genetics outlined in the previous chapters. But since the segregation of 
the genes concerned cannot be followed individually, new methods of study are needed 
and new concepts have to be introduced. The branch of genetics concerned with 
metric characters is called quantitative genetics or biometrical genetics. The importance 
of this branch of genetics need hardly be stressed; most of the characters of 
economic value to plant and animal breeders are metric characters, and most of the 
changes concerned in micro-evolution are changes of metric characters. It is therefore 
in this branch that genetics has its most important application to practical problems 
and also its most direct bearing on evolutionary theory. 

How does it come about that the intrinsically discontinuous variation caused by 
genetic segregation is translated into the continuous variation of metric characters? 
There are two reasons: one is the simultaneous segregation of many genes affecting 
the character, and the other is the superimposition of truly continuous variation arising 
from non-genetic causes. Consider, for example, a simplified situation. Suppose there 
is segregation at 6 unlinked loci, each with 2 alleles at frequencies of 0.5. Suppose 
that there is complete dominance of one allele at each locus and that the dominant 
alleles each add one unit to the measurement of a certain character. Then if the 
segregation of these genes were the only cause of variation there would be 7 discrete 
classes in the measurements of the character, according to whether the individual 
had the dominant allele present at 0, 1, 2, ..., or 6 of the loci. The frequencies 
of the classes would be according to the binomial expansion of $(\tfrac{1}{4} + \tfrac{3}{4})^6$, as shown 
in Fig. 6.1(a). If our measurements were sufficiently accurate we should recognize 
these classes as being distinct and we should be able to place any individual unambiguously 
in its class. If there were more genes segregating but each had a smaller 
effect, there would be more classes with smaller differences between them, as in 
Fig. 6.1(b). It would then be more difficult to distinguish the classes, and if the 
difference between the classes became about as small as the error of measurement 
we should no longer be able to recognize the discontinuities. In addition, metric 
characters are subject to variation from non-genetic causes, and this variation is truly 
continuous. Its effect is, as it were, to blur the edges of the genetic discontinuity 
so that the variation as we see it becomes continuous, no matter how accurate our 
measurements may be. 

Fig. 6.1. Distributions expected from the simultaneous segregation of two alleles at each of 
several or many loci: (a) 6 loci, (b) 24 loci. There is complete dominance of one allele over the 
other at each locus, and the gene frequencies are all 0.5. Each locus, when homozygous for the 
recessive allele, is supposed to reduce the measurement by 1 unit in (a), and by $\tfrac{1}{4}$ unit in (b). The 
horizontal scale, representing the measurement, shows the number of loci homozygous for the 
recessive allele, and the vertical axis shows the probability, or the percentage of individuals 
expected in each class. The probabilities are derived from the binomial expansion of $(\tfrac{1}{4} + \tfrac{3}{4})^n$, 
where $n$ is the number of loci. 
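
The class frequencies for this simplified situation can be computed directly. The sketch below uses a hypothetical function name and assumes, as in the example, unlinked loci with gene frequencies of 0.5 and complete dominance, so that each locus is homozygous for the recessive allele with probability 1/4 independently of the others.

```python
from math import comb


def class_probabilities(n_loci, p_recessive_homozygote=0.25):
    """Probability of 0, 1, ..., n loci being homozygous for the recessive allele.

    Assumption: loci segregate independently, each being recessive-homozygous
    with probability 1/4 (gene frequency 0.5, random mating).
    """
    q = p_recessive_homozygote
    return [comb(n_loci, k) * q ** k * (1.0 - q) ** (n_loci - k)
            for k in range(n_loci + 1)]


for n in (6, 24):
    probs = class_probabilities(n)
    print(f"{n} loci: {len(probs)} classes, largest class {max(probs):.3f}")
    print("  " + " ".join(f"{p:.3f}" for p in probs))
```

With 24 loci there are 25 classes and no single class holds a large share of the individuals, which illustrates why the discontinuities become harder to recognize as the number of loci increases.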

Thus the distinction between genes concerned with Mendelian characters and those 
concerned with metric characters lies in the magnitude of their effects relative to 
other sources of variation. A gene with an effect large enough to cause a recognizable 
discontinuity even in the presence of segregation at other loci and of non-genetic 
variation can be studied by Mendelian methods, whereas a gene whose effect is not 
large enough to cause a discontinuity cannot be studied individually. This distinction 
is reflected in the terms major gene and minor gene. There are, however, all 
intermediate grades, genes that cannot properly be classed as major or as minor. 
And, furthermore, as a result of pleiotropy the same gene may be classed as major 
with respect to one character and minor with respect to another character. The distinction, 
though convenient, is therefore not a fundamental one, and there is no good 
evidence that there are two sorts of genes with different properties. Variation caused 
by the simultaneous segregation of many genes may be called polygenic variation, 
and the minor genes concerned are sometimes referred to as polygenes. 

Metric characters 

The metric characters that might be studied in any higher organism are almost 
infinitely numerous. Any attribute that varies continuously and can be measured might 
in principle be studied as a metric character: anatomical dimensions and proportions, 
physiological functions of all sorts, and mental or psychological qualities. The 
essential condition is that they should be measurable. The technique of measurement, 
however, sets a practical limitation on what can be studied. Usually rather 
large numbers of individuals have to be measured and the study of any character 
whose measurement requires an elaborate technique therefore becomes impracticable. 
Consequently the characters that have been used in studies of quantitative genetics 
are predominantly anatomical dimensions, or physiological functions measured in 
terms of an end-product, such as lactation, fertility, or growth rate. 

Some examples of metric characters are illustrated in Fig. 6.2. The variation is 
represented graphically by the frequency distribution of measurements. The 
measurements are grouped into equally spaced classes and the proportion of 
individuals falling in each class is plotted on the vertical scale. The resulting histogram 
is discontinuous only for the sake of convenience in plotting. If the class ranges were 
made smaller and the number of individuals measured were increased indefinitely, 
the histogram would become a smooth curve. The variation of some metric characters, 
such as bristle number or litter size, is not strictly speaking continuous because, 
being measured by counting, their values can only be whole numbers. Nevertheless, 
one can regard the measurements in such cases as referring to an underlying character 
whose variation is truly continuous though expressible only in whole numbers, in 
a manner analogous to the grouping of measurements into classes. For example, 
litter size may be regarded as a measure of the underlying, continuously varying 
character, fertility. For practical purposes such characters can be treated as continuously 
varying, provided the number of classes is not too small. When there are 
too few classes, as for example when susceptibility to disease is expressed as death 
or survival, different methods have to be employed, as will be explained in Chapter 
18. 

The frequency distributions of most metric characters approximate more or less 
closely to normal curves. This can be seen in Fig. 6.2, where the smooth curves 
drawn through the histograms are normal curves having means and variances 
calculated from the data. In the study of metric characters it is therefore possible 
to make use of the properties of the normal distribution and to apply the appropriate 
statistical techniques. Sometimes, however, the scale of measurement must be 
modified if a distribution approximating to the normal is to be obtained. The distribution
in Fig. 6.2(d), for example, would be skewed if measured and plotted simply
as the number of facets. But it becomes symmetrical, and approximates to a normal
distribution, if measured and plotted in logarithmic units. The criteria on which the
choice of a scale of measurement rests cannot be fully appreciated at this stage, and
will be explained in Chapter 17. Meantime it will be assumed that any metric character
under discussion is measured on an appropriate scale and has a distribution that is
approximately normal.

Fig. 6.2. Frequency distributions of four metric characters, with normal curves superimposed.
The means are indicated by arrows. The characters are as follows, the number of observations
on which each histogram is based being given in brackets:

(a) Mouse (♂♂): growth from 3 to 6 weeks of age. (380)
(b) Mouse: litter size (number of live young in 1st litters). (689)
(c) Drosophila melanogaster (♀♀): number of bristles on ventral surface of 4th and 5th
abdominal segments, together. (900)
(d) Drosophila melanogaster (♀♀): number of facets in the eye of the mutant “Bar”. (488)

(a), (b), and (c) are from original data; (d) is from data of Zeleny (1922).

General survey of the subject-matter 

There are two basic genetic phenomena concerned with metric characters, both more 
or less familiar to all biologists, and each forms the basis of a breeding method. 
The first is the resemblance between relatives. Everyone is familiar with the fact 
that relatives tend to resemble each other, and the closer the relationship, in general
the closer the resemblance. Though it is only in our own species that resemblances 
are readily discernible without measurement, the phenomenon is equally present in 
other species. The degree of resemblance varies with the character, some showing 
more, some less. The resemblance between offspring and parents provides the basis 
for selective breeding. Use of the more desirable individuals as parents brings about 
an improvement of the mean level of the next generation, and just as some characters 
show more resemblance than others, so some are more responsive to selection than 
others. The degree of resemblance between relatives is one of the properties of a
population that can be readily observed, and it is one of the aims of quantitative 
genetics to show how the degree of resemblance between different sorts of relatives 
can be used to predict the outcome of selective breeding and to point to the best 
method of carrying out the selection. The application of genetic principles to selective
breeding of farm animals has led to very substantial improvements, with large
economic benefits; the rates of improvement achieved have been between about 1
and 3 per cent per year (Smith, 1984, 1988). The principles underlying selective
breeding form the central theme of the next seven chapters, the resemblance between
relatives being dealt with in Chapters 9-10, and the effects of selection in
Chapters 11-13.

The second basic genetic phenomenon is inbreeding depression, with its converse 
hybrid vigour, or heterosis. This phenomenon is less familiar to the layman than 
the first, since the laws against incest prevent its more obvious manifestations in 
our own species; but it is well known to animal and plant breeders. Inbreeding tends 
to reduce the mean level of all characters closely connected with fitness in animals 
and in naturally outbreeding plants, and to lead in consequence to loss of general 
vigour and fertility. Since most characters of economic value in domestic animals 
and plants are aspects of vigour or fertility, inbreeding is generally deleterious. The 
reduced vigour and fertility of inbred lines is restored on crossing, and in certain 
circumstances this hybrid vigour can be made use of as a means of improvement. 
The enormous improvement of the yield of commercially grown maize has been 
achieved by this means, yields having been approximately doubled in the USA since 
1935 (Hallauer and Miranda, 1981, p. 408; for improvements in some other crop 
plants see Simmonds, 1979, p. 47). The effects of inbreeding and crossing are 
described in Chapters 14-16.

The properties of a population that we can observe in connection with a metric 
character are means, variances, and covariances. The natural subdivision of the 
population into families allows us to analyse the variance into components which 
form the basis for the measurement of the degree of resemblance between relatives. 
We can in addition observe the consequences of experimentally applied breeding 
methods, such as selection, inbreeding, or cross-breeding. The practical objective 
of quantitative genetics is to find out how we can use the observations made on the 
population as it stands to predict the outcome of any particular breeding method. 
The more general aim is to find out how the observable properties of the population 
are influenced by the properties of the genes concerned and by the various non-genetic 
circumstances that may influence a metric character. The chief properties of genes 
that have to be taken into account are the degree of dominance, the manner in which 
genes at different loci combine their effects, pleiotropy, linkage, and fitness under
natural selection. To take account of all these properties simultaneously, in addition 
to a variety of non-genetic circumstances, would make the problems unmanageably 
complex. We therefore have to simplify matters by dealing with one thing at a time, 
starting with the simpler situations. 

The plan to be followed in the succeeding chapters is this: we shall first show 
what determines the population mean, and then introduce two new concepts, average
effect and breeding value, which are necessary to an understanding of the variance.
Then we shall discuss the variance, its analysis into components, and the covariance
of relatives, which will lead us to the degree of resemblance between relatives. In
all this we shall take full account of dominance from the beginning: the other complicating
factors will be more briefly discussed when they become relevant. The
most important simplification that we shall make concerns the effect of genes on 
fitness: we shall assume that Mendelian segregation is undisturbed by differential 
fitness of the genotypes. The description of means, variances, and covariances will 
refer to a random-breeding population, with Hardy-Weinberg equilibrium genotype
frequencies, with no selection and no inbreeding. That is to say, we shall describe
the population before any special breeding method is applied to it. Then in Chapters
11-13 we shall describe the effects of selection, and in Chapters 14-16 the effects
of inbreeding. This will cover the fundamentals of quantitative genetics, and in the 
final chapters we shall discuss some special topics. 


Problems 

6.1 The figures tabulated are the number of leaves per plant in 25 F1 and 25 F2
plants from a cross of two cultivated varieties of tobacco, which had mean leaf
numbers of 15.0 and 17.9 respectively. Tabulate (and plot if desired) the frequency
distributions of the F1 and F2 generations. From each distribution calculate the
mean, the variance, and the standard error of the mean. What is the main difference 
between the two distributions? 


F1
18 15 16 18 15 16 14 16 18 17 16 13 16 14 16 15 16 15 15 16 15 16 16 15 16

F2
16 20 19 17 14 16 14 14 15 17 20 13 12 15 16 21 18 15 14 18 14 17 13 15 13

Data from Johnson, J. (1919) Genetics, 4, 307-40.


[Solution 5] 

6.2 The table gives the weights (g) at 60 days of age of 50 male mice, bred from
12 pairs of parents, in one generation of a strain that had been bred selectively for 
small size. Tabulate (and plot if desired) the frequency distribution and calculate 
the mean and variance. What is peculiar about the distribution? 


12.8  15.7  13.7   5.7   6.1  12.8  14.1  14.0  17.8  14.7
 5.7  11.7  13.4   6.6   6.8  11.8  11.4  10.7  15.6  14.9
11.8  17.2  15.0  15.1  14.6  13.2  12.5   7.0   6.6  14.8
11.8  13.5  14.1  16.2   6.6  15.6  13.0  11.5  12.1  15.4
 4.7  13.1  13.4  14.5  13.4  14.8  15.1  14.9  18.4  16.5

Data kindly supplied by Dr J.W.B. King.


[Solution 15] 


6.3 Work out the frequency distribution of the genotypic classes, from which to 
construct a histogram (like Fig. 6.1), when a metric character is affected by 4 independently
segregating loci, each with 2 alleles and complete dominance. The
recessive alleles when homozygous each reduce the measurement by 1 unit. All 
recessive alleles have a gene frequency of 0.3. 

[Solution 25] 

6.4 Work out the frequency distribution with everything the same as in Problem 
6.3 except that the frequencies of the recessive alleles are 0.3 at two of the loci and 
0.7 at the other two. 

[Solution 35] 

6.5 Work out the frequency distribution when there are three loci, each with 2 
alleles and with no dominance. At all loci, homozygotes differ by 2 units of measurement
and heterozygotes differ by 1 unit from both homozygotes. At all loci the gene
frequency of the allele that decreases the measurement is 0.4. 

[Solution 45] 

6.6 What gene frequency would produce a perfectly symmetrical distribution of 
measurement classes under the conditions of (1) Problem 6.3 and (2) Problem 6.5? 

[Solution 55] 
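
The distributions asked for in Problems 6.3 to 6.5 are built up by combining loci one at a time: the effects of independent loci add and their frequencies multiply. The following Python sketch illustrates the mechanics only. The function names are our own, and the case shown (two loci, complete dominance, recessive frequency 0.5) is a hypothetical one, not any of the problems.

```python
from collections import defaultdict

def locus_classes(q, dominance="complete"):
    """Return {effect: frequency} for one locus under random mating.

    q is the frequency of the allele that lowers the measurement.
    Effects are in units of measurement, relative to the top class.
    """
    p = 1.0 - q
    if dominance == "complete":
        # Only the recessive homozygote (frequency q^2) is lowered, by 1 unit.
        return {0: 1.0 - q * q, -1: q * q}
    if dominance == "none":
        # Heterozygotes are 1 unit, low homozygotes 2 units, below the top class.
        return {0: p * p, -1: 2 * p * q, -2: q * q}
    raise ValueError("dominance must be 'complete' or 'none'")

def combine(loci):
    """Combine independent loci: effects add, frequencies multiply."""
    dist = {0: 1.0}
    for locus in loci:
        new = defaultdict(float)
        for e1, f1 in dist.items():
            for e2, f2 in locus.items():
                new[e1 + e2] += f1 * f2
        dist = dict(new)
    return dist

# Hypothetical illustration: two loci, complete dominance, q = 0.5 at both.
example = combine([locus_classes(0.5), locus_classes(0.5)])
for effect in sorted(example, reverse=True):
    print(effect, round(example[effect], 4))
```

The class frequencies always sum to 1, which is a useful check on any distribution worked out by hand.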


We have seen in the early chapters that the genetic properties of a population are
expressible in terms of the gene frequencies and genotype frequencies. In order to 
deduce the connection between these on the one hand and the quantitative differences 
exhibited in a metric character on the other, we must introduce a new concept, the 
concept of value, expressible in the metric units by which the character is measured.
The value observed when the character is measured on an individual is the phenotypic
value of that individual. All observations, whether of means, variances, or
covariances, must clearly be based on measurements of phenotypic values. In order
to analyse the genetic properties of the populations we have to divide the phenotypic
value into component parts attributable to different causes. Explanation of the meanings
of these components is our chief concern in this chapter, though we shall also
be able to find out how the population mean is influenced by the array of gene 
frequencies. 

The first division of phenotypic value is into components attributable to the influence 
of genotype and environment. The genotype is the particular assemblage of genes 
possessed by the individual, and the environment is all the non-genetic circumstances 
that influence the phenotypic value. Inclusion of all non-genetic circumstances under 
the term ‘environment’ means that the genotype and the environment are by definition
the only determinants of phenotypic value. The two components of value
associated with genotype and environment are the genotypic value and the environmental
deviation. We may think of the genotype conferring a certain value on the individual
and the environment causing a deviation from this, in one direction or the
other. Or, symbolically, 

P = G + E ... [7.1] 

where P is the phenotypic value, G is the genotypic value, and E is the environmental
deviation. The mean environmental deviation in the population as a whole is taken
to be zero, so that the mean phenotypic value is equal to the mean genotypic value. 
The term population mean then refers equally to phenotypic or to genotypic values. 
When dealing with successive generations we shall assume for simplicity that the 
environment remains constant from generation to generation, so that the population 
mean is constant in the absence of genetic change. If we could replicate a particular 
genotype in a number of individuals and measure them under environmental conditions
normal for the population, their mean environmental deviations would be zero,
and their mean phenotypic value would consequently be equal to the genotypic value
of that particular genotype. This is the meaning of the genotypic value of an individual.
In principle it is measurable, but in practice it is not, except when we are
concerned with a single locus where the genotypes are phenotypically distinguishable,
or with the genotypes represented in highly inbred lines.

[Figure: a linear scale of genotypic values, marked from left to right: -a ($A_2A_2$), 0 (origin), d ($A_1A_2$), +a ($A_1A_1$).]
Fig. 7.1. Arbitrarily assigned genotypic values.

For the purposes of deduction we must assign arbitrary values to the genotypes 
under discussion. This is done in the following way. Considering a single locus with 
two alleles, $A_1$ and $A_2$, we call the genotypic value of one homozygote +a, that
of the other homozygote -a, and that of the heterozygote d. (We shall adopt the convention
that $A_1$ is the allele that increases the value.) We thus have a scale of
genotypic values as in Fig. 7.1. The origin, or point of zero value, on this scale
is mid-way between the values of the two homozygotes. The value d of the
heterozygote depends on the degree of dominance. If there is no dominance, d = 0;
if $A_1$ is dominant over $A_2$, d is positive, and if $A_2$ is dominant over $A_1$, d is
negative. If dominance is complete, d is equal to +a or -a, and if there is overdominance,
d is greater than +a or less than -a. The degree of dominance may
be expressed as d/a.

Example 7.1 For the purposes of illustration in this chapter, and also later on, we shall 
refer to a dwarfing gene in the mouse, known as ‘pygmy’ (symbol pg), described by 
King (1950, 1955), and by Warwick and Lewis (1954). This gene reduces body-size 
and is nearly, but not quite, recessive in its effect on size. It was present in a strain of 
small mice (MacArthur’s) at the time the studies cited above were made. The weights 
of mice of the three genotypes at 6 weeks of age were approximately as follows (sexes 
averaged): 



Genotype            + +     + pg    pg pg
Weight in grams     14      12      6

(The weight of heterozygotes given here is to some extent conjectural, but it is unlikely
to be more than 1 g in error.) These are average weights obtained under normal environmental
conditions, and they are therefore the genotypic values. The mid-point in
genotypic value between the two homozygotes is 10 g, and this is the origin, or zero-point,
on the scale of values assigned as in Fig. 7.1. The value of a on this scale is therefore
4 g, and that of d is 2 g.
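
The conversion from observed genotypic values to the scale of Fig. 7.1 can be sketched in a few lines of Python. The function name `scale_values` is our own, introduced only for illustration; the weights are those given above for the pygmy gene.

```python
def scale_values(high_homozygote, heterozygote, low_homozygote):
    """Return (origin, a, d) on the scale of Fig. 7.1."""
    origin = (high_homozygote + low_homozygote) / 2   # mid-homozygote point
    a = high_homozygote - origin                      # half the homozygote difference
    d = heterozygote - origin                         # heterozygote deviation from origin
    return origin, a, d

origin, a, d = scale_values(14, 12, 6)   # weights in grams from Example 7.1
print(origin, a, d)                      # 10.0 4.0 2.0
print(d / a)                             # degree of dominance: 0.5
```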

Population mean 

We can now see how the gene frequencies influence the mean of the character in 
the population as a whole. Let the gene frequencies of $A_1$ and $A_2$ be p and q respectively.
Then the first two columns of Table 7.1 show the three genotypes and their
frequencies in a random breeding population, from formula [1.2]. The third column 
shows the genotypic values as specified above. The mean value in the whole population
is obtained by multiplying the value of each genotype by its frequency and summing
over the three genotypes. The reason why this yields the mean value may be
understood by converting frequencies to numbers of individuals. Multiplying the
value by the number of individuals in each genotype and summing over genotypes
gives the sum of values of all individuals. The mean value would then be this sum
of values divided by the total number of individuals. The procedure in working with
frequencies is the same, but since the sum of the frequencies is 1, the sum of values
× frequencies is the mean value. In other words, the division by the total number
has already been made in obtaining the frequencies. Multiplication of values by frequencies
to obtain the mean value is a procedure that will be often used in this chapter
and subsequent ones. Returning to the population mean, multiplication of the value
by the frequency of each genotype is shown in the last column of Table 7.1. Summation
of this column is simplified by noting that $p^2 - q^2 = (p + q)(p - q) = p - q$.
The population mean, which is the sum of this column, is thus

$M = a(p - q) + 2dpq$ ... [7.2]

This is both the mean genotypic value and the mean phenotypic value of the population
with respect to the character.

The contribution of any locus to the population mean thus has two terms:
a(p - q) attributable to homozygotes, and 2dpq attributable to heterozygotes. If there
is no dominance (d = 0), the second term is zero, and the mean is proportional
to the gene frequency: $M = a(1 - 2q)$. If there is complete dominance (d = a),
the mean is proportional to the square of the gene frequency: $M = a(1 - 2q^2)$.
The total range of values attributable to the locus is 2a, in the absence of overdominance.
That is to say, if $A_1$ were fixed in the population (p = 1) the population
mean would be a, and if $A_2$ were fixed (q = 1) it would be -a. If the locus
shows overdominance, however, the mean of an unfixed population may be outside 
this range. 
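
The statements above can be checked numerically. The following Python sketch, in which the function names and the trial values of a, d and q are our own assumptions, compares the closed form of equation [7.2] with the frequency × value sum of Table 7.1 and with the two special cases just mentioned.

```python
def population_mean(a, d, q):
    """Equation [7.2]: mean as a deviation from the mid-homozygote point."""
    p = 1.0 - q
    return a * (p - q) + 2 * d * p * q

def mean_from_table(a, d, q):
    """Sum of frequency x value over the three genotypes (Table 7.1)."""
    p = 1.0 - q
    rows = [(p * p, a), (2 * p * q, d), (q * q, -a)]
    return sum(freq * value for freq, value in rows)

a = 1.0  # arbitrary illustrative value
for q in (0.0, 0.25, 0.5, 0.75, 1.0):
    # No dominance: M = a(1 - 2q)
    assert abs(population_mean(a, 0.0, q) - a * (1 - 2 * q)) < 1e-12
    # Complete dominance: M = a(1 - 2q^2)
    assert abs(population_mean(a, a, q) - a * (1 - 2 * q * q)) < 1e-12
    # Closed form agrees with the tabular sum for partial dominance
    assert abs(population_mean(a, 0.5, q) - mean_from_table(a, 0.5, q)) < 1e-12

# Overdominance (d > a): the mean of an unfixed population can exceed +a.
print(population_mean(1.0, 3.0, 0.5) > 1.0)   # True
```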

The genotypic values a and d are deviations from the mean value of the two
homozygotes, as shown in Fig. 7.1. It follows that the population mean expressed
in equation [7.2] is a deviation from the mid-homozygote value, which is the origin
or zero-point of the scale. If the mean is to be expressed as a deviation from some
other value, an appropriate constant must be added or subtracted. For example, one
might want to express the mean as a deviation from the value of the lower
homozygote. This would require the addition of a, and the mean would become,
after some simplification, $M = 2p(a + dq)$. Or, expressed as a deviation from
the upper homozygote, it would be $M = 2q(-a + dp)$.

Table 7.1

Genotype      Frequency    Value    Freq. × Val.
$A_1A_1$      $p^2$        +a       $p^2 a$
$A_1A_2$      $2pq$        d        $2pqd$
$A_2A_2$      $q^2$        -a       $-q^2 a$

Sum = $a(p - q) + 2dpq$

Example 7.2 Let us take again the pygmy gene in mice, as described in Example 7.1, 
and see what effect this gene would have on the population mean when present at two 
particular frequencies. First, the total range is from 6 g to 14 g: a population consisting 
entirely of pygmy homozygotes would have a mean of 6 g, and one from which the gene 
was entirely absent would have a mean of 14 g. (These values refer specifically to MacArthur’s
Small Strain at the time the observations were made.) Now suppose the gene were
present at a frequency of 0.1, so that under random mating homozygotes would appear
with a frequency of 1 per cent. The values to be substituted in equation [7.2] are p =
0.9, q = 0.1, and a = 4 g, d = 2 g, as shown in Example 7.1. The population mean,
by equation [7.2], is therefore: M = 4 × 0.8 + 2 × 0.18 = 3.56. This value of the
mean, however, is measured from the mid-homozygote point, which is 10 g, as origin. 
Therefore the actual value of the population mean is 13.56 g. Next suppose the gene 
were present at a frequency of 0.4. Substituting in the same way, we find M = 1.76, 
to which must be added 10 g for the origin, giving a value of 11.76 g. Rough corroboration 
of these figures is given by the records of the strain carrying the gene. When the gene 
was present at a frequency of about 0.4 the mean weight was about 12 g. Two generations
later, when the pygmy gene had been deliberately eliminated, the mean weight
rose to about 14 g. 
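
The arithmetic of this example, together with the change of origin described before it, may be sketched in Python as follows. The constant and function names are our own; the figures are those of Examples 7.1 and 7.2.

```python
def population_mean(a, d, q):
    """Equation [7.2]: deviation of the mean from the mid-homozygote point."""
    p = 1.0 - q
    return a * (p - q) + 2 * d * p * q

ORIGIN, A, D = 10.0, 4.0, 2.0   # grams, from Example 7.1

for q in (0.1, 0.4):
    p = 1.0 - q
    m = population_mean(A, D, q)
    from_lower = 2 * p * (A + D * q)    # deviation from the lower homozygote
    from_upper = 2 * q * (-A + D * p)   # deviation from the upper homozygote
    # All three routes should give the same mean weight in grams.
    print(q,
          round(ORIGIN + m, 2),
          round((ORIGIN - A) + from_lower, 2),
          round((ORIGIN + A) + from_upper, 2))
```

The three columns printed for each gene frequency agree, which confirms that the choice of origin alters only the constant to be added.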

Now we have to put together the contributions of genes at several loci and find 
their joint effect on the mean. This introduces the question of how genes at different 
loci combine to produce a joint effect on the character. For the moment we shall 
suppose that combination is by addition, which means that the value of a genotype 
with respect to several loci is the sum of the values attributable to the separate loci. 
For example, if the genotypic value of $A_1A_1$ is $a_A$ and that of $B_1B_1$ is $a_B$, then the
genotypic value of $A_1A_1B_1B_1$ is $a_A + a_B$. The consequences of non-additive combination
will be explained at the end of this chapter. With additive combination,
then, the population mean resulting from the joint effects of several loci is the sum
of the contributions of each of the separate loci, thus:

$M = \sum a(p - q) + 2\sum dpq$ ... [7.3]

This is again both the genotypic and the phenotypic mean value. The total range
in the absence of overdominance is now $2\sum a$. If all alleles that increase the value
were fixed, the mean would be $+\sum a$, and if all alleles that decrease the value were
fixed, it would be $-\sum a$. These are the theoretical limits to the range of potential variation
in the population. The origin from which the mean value in equation [7.3] is
measured is the mid-point of the total range. This is equivalent to the average mid-homozygote
point of all the loci separately.
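
Equation [7.3] amounts to applying equation [7.2] locus by locus and adding the results. A minimal Python sketch follows; the function name is our own and the three loci are hypothetical, chosen only to illustrate the summation and the limits of the range.

```python
def population_mean_multilocus(loci):
    """Equation [7.3]; loci is a list of (a, d, q) tuples, one per locus."""
    total = 0.0
    for a, d, q in loci:
        p = 1.0 - q
        total += a * (p - q) + 2 * d * p * q
    return total

# Hypothetical loci (a, d, q); the numbers are illustrative only.
loci = [(4.0, 2.0, 0.1), (1.0, 0.0, 0.5), (0.5, 0.5, 0.3)]
mean = population_mean_multilocus(loci)
upper = sum(a for a, _, _ in loci)   # all increasing alleles fixed
lower = -upper                       # all decreasing alleles fixed
print(lower, round(mean, 3), upper)
```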

Example 7.3 As an example of two loci that combine additively, we shall refer to two 
colour genes in mice, whose effects on the number of pigment granules have been described
by Russell (1949). This is a metric character which reflects the intensity of pigmentation
in the coat. The two genes are ‘brown’ (b) and ‘extreme dilution’ ($c^e$), an allele
of the albino series. Measurements were made of the number of melanin granules per
unit volume of hair, in wild-type homozygotes, in the two single mutant homozygotes,
and in the double mutant homozygote. We shall assume both wild-type alleles to be completely
dominant, so that only these four genotypes need be considered. The mean numbers
of granules in the four genotypes were as shown in the table.

                B-      bb      $2a_B$
C-              95      90      5
$c^e c^e$       38      34      4
$2a_C$          57      56

The difference between the two figures in each row and in each column measures the
homozygote difference, or 2a on the scale of values assigned as in Fig. 7.1. Apart from
the trivial discrepancy of 1 unit, these differences are independent of the genotype at
the other locus. In other words, the difference of value between B- and bb is the same
among C- genotypes as it is among $c^e c^e$ genotypes; and similarly the difference between
C- and $c^e c^e$ is the same in B- as it is in bb. Thus the two loci combine additively, and
the value of a composite genotype can be rightly predicted from knowledge of the values
of the single genotypes. For example, the bb genotype is 5 units less than the wild-type,
and the $c^e c^e$ is 57 units less; therefore bb $c^e c^e$ should be 62 units less than the wild-type
value of 95, namely 33, which is almost identical with the observed value of 34.
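
The test of additivity made in this example can be written out as a short Python sketch. The dictionary layout and the label `cece` are our own; the value 90 for the bb C- genotype is taken as the wild-type value of 95 less the 5 units stated in the text.

```python
# Mean granule numbers for the four genotypes of Example 7.3.
granules = {
    ("B-", "C-"): 95,
    ("bb", "C-"): 90,     # 95 less the 5-unit effect of bb stated in the text
    ("B-", "cece"): 38,
    ("bb", "cece"): 34,
}

# Homozygote difference (2a) at each locus, within each background.
two_a_B = {c: granules[("B-", c)] - granules[("bb", c)] for c in ("C-", "cece")}
two_a_C = {b: granules[(b, "C-")] - granules[(b, "cece")] for b in ("B-", "bb")}
print(two_a_B)   # {'C-': 5, 'cece': 4}
print(two_a_C)   # {'B-': 57, 'bb': 56}

# Prediction for the double mutant if the two loci combine additively.
predicted = granules[("B-", "C-")] - two_a_B["C-"] - two_a_C["B-"]
print(predicted, granules[("bb", "cece")])   # 33 34
```

If the loci did not combine additively, the two entries within each dictionary would differ by more than a trivial amount.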

Average effect 

In order to deduce the properties of a population connected with its family structure,
we have to deal with the transmission of value from parent to offspring, and
this cannot be done by means of genotypic values alone, because parents pass on 
their genes and not their genotypes to the next generation, genotypes being created 
afresh in each generation. A new measure of value is therefore needed which will 
refer to genes and not to genotypes. This will enable us to assign a ‘breeding value’ 
to individuals, a value associated with the genes carried by the individual and transmitted
to its offspring. The new value associated with genes as distinct from genotypes
is known as the average effect. Average effects depend on the genotypic values,
a and d as previously defined, and also on the gene frequencies. Average effects
are therefore properties of populations as well as of the genes concerned. The concept
of average effects is not easy to grasp, but it is fundamental to understanding
the inheritance of quantitative characters. There are several ways in which average
effects can be defined. They are all equivalent under random mating but not otherwise
(see Falconer, 1985); we are concerned here only with random breeding populations.
One definition is this: the average effect of a particular gene (allele) is the
mean deviation from the population mean of individuals which received that gene
from one parent, the gene received from the other parent having come at random
from the population. This may be stated in another way. Let a number of gametes
all carrying $A_1$ unite at random with gametes from the population; then the mean
of the genotypes so produced deviates from the population mean by an amount which
is the average effect of the $A_1$ gene.

Table 7.2

Type of    Values and frequencies of genotypes produced    Mean value of         Population mean         Average effect
gamete     $A_1A_1$ (a)    $A_1A_2$ (d)    $A_2A_2$ (-a)   genotypes produced    to be deducted          of gene
$A_1$      p               q               -               pa + qd               -[a(p - q) + 2dpq]      q[a + d(q - p)]
$A_2$      -               p               q               -qa + pd              -[a(p - q) + 2dpq]      -p[a + d(q - p)]

Let us see how the average effect is related to the genotypic values a and d, in 
terms of which the population mean was expressed. This will help to make the concept
clearer. The reasoning is set out in Table 7.2. Consider a locus with two alleles,
$A_1$ and $A_2$, at frequencies p and q respectively, and take first the average effect of
the gene $A_1$, for which we shall use the symbol $\alpha_1$. If gametes carrying $A_1$ unite
at random with gametes from the population, the frequencies of the genotypes produced
will be p of $A_1A_1$ and q of $A_1A_2$. The genotypic value of $A_1A_1$ is +a and
that of $A_1A_2$ is d, and the mean of these, taking account of the proportions in which
they occur, is pa + qd. The difference between this mean value and the population
mean is the average effect of the gene $A_1$. Taking the value of the population mean
from equation [7.2], we get

$\alpha_1 = pa + qd - [a(p - q) + 2dpq]$
$\quad\; = q[a + d(q - p)]$ ... [7.4a]

Similarly, the average effect of the gene $A_2$ is

$\alpha_2 = -p[a + d(q - p)]$ ... [7.4b]

When there are more than two alleles the average effect of each allele can be expressed 
in a similar way. The reason why average effects depend on gene frequencies can 
be seen in the words ‘taken at random’ in the definition, because the content of a 
random sample depends on the gene frequency in the population. 
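
The derivation of equations [7.4] can be verified numerically. In the following Python sketch the function names are our own, and the values a = 4, d = 2 are borrowed from Example 7.1 simply as convenient trial figures. The average effects are computed once directly from the definition and once from the formulae.

```python
def population_mean(a, d, q):
    """Equation [7.2]."""
    p = 1.0 - q
    return a * (p - q) + 2 * d * p * q

def average_effects_by_definition(a, d, q):
    """Mean of genotypes formed by one given gamete and one random gamete,
    less the population mean (the reasoning of Table 7.2)."""
    p = 1.0 - q
    m = population_mean(a, d, q)
    alpha1 = (p * a + q * d) - m   # A1 gamete meets A1 (freq. p) or A2 (freq. q)
    alpha2 = (p * d - q * a) - m   # A2 gamete meets A1 (freq. p) or A2 (freq. q)
    return alpha1, alpha2

def average_effects_by_formula(a, d, q):
    """Equations [7.4a] and [7.4b]."""
    p = 1.0 - q
    common = a + d * (q - p)
    return q * common, -p * common

for q in (0.1, 0.4, 0.7):
    by_def = average_effects_by_definition(4.0, 2.0, q)
    by_formula = average_effects_by_formula(4.0, 2.0, q)
    assert all(abs(x - y) < 1e-12 for x, y in zip(by_def, by_formula))
    # Weighted by the gene frequencies, the two average effects cancel.
    assert abs((1 - q) * by_def[0] + q * by_def[1]) < 1e-12
```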

When only two alleles at a locus are under consideration it is more convenient 
to express their average effects in terms of the average effect of the gene substitution.
This is simply the difference between the average effects of the two alleles,
but its meaning may be more clearly understood in the following way. Suppose that
we could change $A_2$ genes chosen at random into $A_1$ genes, as if by directed mutation,
and could then note the resulting change of value; the mean change so produced
would be the average effect of the gene substitution. When $A_2$ genes are chosen
at random a proportion p will be found in $A_1A_2$ genotypes (p being the gene frequency
of $A_1$) and a proportion q in $A_2A_2$ genotypes. Changing $A_1A_2$ into $A_1A_1$
will change the value from d to +a, and the effect will therefore be (a - d). Changing
$A_2A_2$ into $A_1A_2$ will change the value from -a to d, and the effect will be
(d + a). The average change is therefore p(a - d) + q(d + a), which on rearrangement
becomes a + d(q - p). Thus the average effect of the gene substitution (written
as $\alpha$, without subscript) is

$\alpha = a + d(q - p)$ ... [7.5]

From equations [7.4] it can readily be seen that

$\alpha = \alpha_1 - \alpha_2$

and that the average effects of the two alleles, when expressed in terms of the average
effect of the gene substitution, are

$\alpha_1 = q\alpha$
$\alpha_2 = -p\alpha$ ... [7.6]

The breeding values of genotypes can conveniently be expressed in these terms, as
will be seen in the next section. Another definition of the average effect of a gene
substitution will be given later, with a graphical representation in Fig. 7.2.


Example 7.4 Consider again the pygmy gene and its effect on body weight, which 
was found in Example 7.1 to be a = 4 g and d = 2 g. If the frequency of the pg gene 
were q = 0.1, the average effect of the gene substitution would be, by equation [7.5], 
$\alpha$ = 4 + 2(0.1 - 0.9) = 2.4 g. And if the frequency were q = 0.4, the average effect
of the gene substitution would be $\alpha$ = 4 + 2(0.4 - 0.6) = 3.6 g. The average effects
of the genes separately, by equation [7.6], are as follows.

                                            q = 0.1    q = 0.4
Average effect of +:    $\alpha_1$ =        +0.24      +1.44
Average effect of pg:   $\alpha_2$ =        -2.16      -2.16
$\alpha = \alpha_1 - \alpha_2$ =            2.40       3.60

Thus the average effect of the gene substitution, $\alpha$, is greater when the gene frequency
is greater. The identity of the average effects of pg at the two gene frequencies is only
a coincidence.
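
The figures of this example can be reproduced by following the directed-mutation argument step by step. The Python sketch below does so; the function and variable names are our own, and a = 4 g, d = 2 g are the values found in Example 7.1.

```python
def substitution_effect(a, d, q):
    """Mean change when a randomly chosen A2 gene is turned into A1."""
    p = 1.0 - q
    change_in_heterozygote = a - d   # A1A2 -> A1A1, met with frequency p
    change_in_homozygote = d + a     # A2A2 -> A1A2, met with frequency q
    return p * change_in_heterozygote + q * change_in_homozygote

A, D = 4.0, 2.0   # pygmy gene, grams (Example 7.1)
for q in (0.1, 0.4):
    alpha = substitution_effect(A, D, q)
    alpha_plus = q * alpha           # average effect of +   (equation [7.6])
    alpha_pg = -(1.0 - q) * alpha    # average effect of pg  (equation [7.6])
    print(q, round(alpha, 2), round(alpha_plus, 2), round(alpha_pg, 2))
```

Trying other values of q in the loop shows that the average effect of pg is not in general the same at different gene frequencies.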

Breeding value 

The usefulness of the concept of average effect arises from the fact, already noted, 
that parents pass on their genes and not their genotypes to their progeny. It is therefore 
the average effects of a parent’s genes that determine the mean genotypic value
of its progeny. The value of an individual, judged by the mean value of its progeny, 
is called the breeding value of the individual. Breeding value, unlike average effect, 
can therefore be measured. If an individual is mated to a number of individuals taken 
at random from the population, then its breeding value is twice the mean deviation 
of the progeny from the population mean. The deviation has to be doubled because 
the parent in question provides only half the genes in the progeny, the other half 
coming at random from the population. Breeding values can be expressed in absolute 
units, but are usually more conveniently expressed in the form of deviations from 
the population mean, as defined above. Just as the average effect is a property of 
the gene and the population, so is the breeding value a property of the individual 
and the population from which its mates are drawn. One cannot speak of an 
individual’s breeding value without specifying the population in which it is to be 
mated. 
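
The practical definition just given, twice the mean deviation of the progeny from the population mean, can be sketched in Python for a single locus. The function names and the coding of the parental genotype as a count of $A_1$ alleles are our own assumptions; a = 4, d = 2 and q = 0.1 are trial values taken from the pygmy examples.

```python
def population_mean(a, d, q):
    """Equation [7.2]."""
    p = 1.0 - q
    return a * (p - q) + 2 * d * p * q

def progeny_mean(n_A1, a, d, q):
    """Mean genotypic value of the progeny of one parent mated at random.

    n_A1 is the number of A1 alleles the parent carries (2, 1 or 0).
    """
    p = 1.0 - q
    from_A1_gamete = p * a + q * d   # parent's A1 gamete meets a random gamete
    from_A2_gamete = p * d - q * a   # parent's A2 gamete meets a random gamete
    share_A1 = n_A1 / 2              # fraction of the parent's gametes carrying A1
    return share_A1 * from_A1_gamete + (1 - share_A1) * from_A2_gamete

def breeding_value(n_A1, a, d, q):
    """Twice the deviation of the progeny mean from the population mean."""
    return 2 * (progeny_mean(n_A1, a, d, q) - population_mean(a, d, q))

for n_A1 in (2, 1, 0):
    print(n_A1, round(breeding_value(n_A1, 4.0, 2.0, 0.1), 2))
```

Changing q in the last line shows directly that the same genotype has a different breeding value in a population with a different gene frequency.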

Defined in terms of average effects, the breeding value of an individual is equal
to the sum of the average effects of the genes it carries, the summation being made
over the pair of alleles at each locus and over all loci. Thus, for a single locus with
two alleles, the breeding values of the genotypes are as follows:

Genotype      Breeding value
$A_1A_1$      $2\alpha_1 = 2q\alpha$
$A_1A_2$      $\alpha_1 + \alpha_2 = (q - p)\alpha$
$A_2A_2$      $2\alpha_2 = -2p\alpha$


Example 7.5 Let us illustrate breeding values by reference to the pygmy gene in mice. 
The average effects of the + and pg genes were given in the last example. From these 
we may find the breeding values of the three genotypes as explained above. These breeding 
values, which are given below, are deviations from the population mean. The population 
means with gene frequencies of 0.1 and 0.4 were found in Example 7.2 and are 
shown again below in the column headed M. 

                      Breeding values 
            M         ++        +pg       pg pg 
q = 0.1     13.56     +0.48     -1.92     -4.32 
q = 0.4     11.76     +2.88     -0.72     -4.32 


(The breeding values of pygmy homozygotes are only hypothetical because in fact 
pygmy homozygotes are nearly all sterile: but this complication may be overlooked in 
the present context.) 
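
The breeding values of Example 7.5 can be recomputed from the formulae 2qα, (q - p)α and -2pα. The following is a minimal sketch, assuming the assigned values a = 4 and d = 2 of Example 7.1; the function name `breeding_values` is introduced here for illustration only.

```python
def breeding_values(q, a, d):
    """Breeding values, as deviations from the population mean, for one locus.

    q is the frequency of the allele that lowers the value (pg here);
    a and d are the assigned genotypic values (assumed: a = 4, d = 2).
    """
    p = 1.0 - q
    alpha = a + d * (q - p)            # average effect of the gene substitution
    return {
        "++": 2 * q * alpha,
        "+pg": (q - p) * alpha,
        "pg pg": -2 * p * alpha,
    }


for q in (0.1, 0.4):
    values = breeding_values(q, a=4.0, d=2.0)
    print(q, {genotype: round(v, 2) for genotype, v in values.items()})
```

The printed values should agree with the two rows of the table above, to two decimal places.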


Extension to a locus with more than two alleles is straightforward, the breeding 
value of any genotype being the sum of the average effects of the two alleles 
present. If all loci are to be taken into account, the breeding value of a particular genotype 
is the sum of the breeding values attributable to each of the separate loci. If there 
is non-additive combination of genotypic values, a slight complication arises. We 
have given two definitions of breeding value, a practical one in terms of the measured 
value of the progeny and a theoretical one in terms of average effects. Non-additive 
combination renders these two definitions not quite equivalent. This point will be 
more fully explained in Chapter 9. 

Consideration of the definition of breeding value will show that in a population 
in Hardy-Weinberg equilibrium the mean breeding value must be zero; or if breeding 
values are expressed in absolute units the mean breeding value must be equal to 
the mean genotypic value and to the mean phenotypic value. This can be verified 
from the breeding values listed above. Multiplying the breeding value by the 
frequency of each genotype and summing gives the mean breeding value (expressed 
as a deviation from the population mean) as 

$2p^2q\alpha + 2pq(q - p)\alpha - 2q^2p\alpha = 2pq\alpha(p + q - p - q) = 0$ 

The breeding value is sometimes referred to as the ‘additive genotype’, and 
variation in breeding value ascribed to the ‘additive effects’ of genes. Though we shall 
not use these terms we shall follow custom in using the term ‘additive’ in 
connection with the variation of breeding values to be discussed in the next chapter, and 
we shall use the symbol A to designate the breeding value of an individual. 

Because the breeding value expresses the value transmitted from parents to 
offspring it follows that the expected breeding value of any individual is the average 
of the breeding values of its two parents; and it follows from the definition of breeding 
value that this is also the individual’s expected phenotypic value. Different offspring 
of the same parents will differ in breeding value according to which of each parent’s 
two alleles they receive; the ‘expected’ values are simply the mean values of a large 
number of offspring of the same parents. So the transmission of value from parents 
to offspring is expressed by 

$P_o = A_o = \tfrac{1}{2}(A_s + A_d)$ ... [7.7] 

where the subscripts o, s, and d refer to offspring, sire, and dam respectively. 
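
Equation [7.7] can be checked for a single locus by listing the four equally likely offspring of each mating. This is a sketch only, assuming the pygmy-gene values a = 4, d = 2 and q = 0.1, and taking the average effects of the two alleles as qα and -pα; the helper names are hypothetical.

```python
from itertools import product


def average_effects(q, a, d):
    """Average effects of allele 1 and allele 2 (q = frequency of allele 2)."""
    p = 1.0 - q
    alpha = a + d * (q - p)
    return {1: q * alpha, 2: -p * alpha}


def breeding_value(genotype, effects):
    """Sum of the average effects of the two alleles in the genotype."""
    return sum(effects[allele] for allele in genotype)


def expected_offspring_bv(sire, dam, effects):
    """Mean breeding value over the four equally likely offspring genotypes."""
    offspring = list(product(sire, dam))   # one allele from each parent
    return sum(breeding_value(g, effects) for g in offspring) / len(offspring)


effects = average_effects(q=0.1, a=4.0, d=2.0)
genotypes = [(1, 1), (1, 2), (2, 2)]
for sire in genotypes:
    for dam in genotypes:
        midparent = 0.5 * (breeding_value(sire, effects) + breeding_value(dam, effects))
        print(sire, dam, round(expected_offspring_bv(sire, dam, effects), 2), round(midparent, 2))
```

For every mating the last two columns should coincide, which is the content of [7.7] for breeding values.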


Dominance deviation 

We have separated off the breeding value as a component part of the genotypic value 
of an individual. Let us consider now what makes up the remainder. When a single 
locus only is under consideration, the difference between the genotypic value G and 
the breeding value A of a particular genotype is known as the dominance deviation 
D, so that 

$G = A + D$ ... [7.8] 

The dominance deviation arises from the property of dominance among the alleles 
at a locus, since in the absence of dominance, breeding values and genotypic values 
coincide. From the statistical point of view the dominance deviations are interactions 
between alleles, or within-locus interactions. They represent the effect of putting 
genes together in pairs to make genotypes; the effect not accounted for by the effects 
of the two genes taken singly. Since the average effects of genes, and the breeding 
values of genotypes, depend on the gene frequency in the population, the dominance 
deviations are also dependent on gene frequency. They are therefore partly 
properties of the population and are not simply measures of the degree of dominance. 


Example 7.6 Continuing the example of the pygmy gene, we may now list the genotypic 
values and the breeding values, and so obtain the dominance deviations of the three 
genotypes, by equation [7.8]. These values, all now expressed as deviations from the 
population mean M, are given in the table. 



                       q = 0.1: M = 13.56            q = 0.4: M = 11.76 
                       ++      +pg     pg pg         ++      +pg     pg pg 
Frequency              0.81    0.18    0.01          0.36    0.48    0.16 
Genotypic value, G     +0.44   -1.56   -7.56         +2.24   +0.24   -5.76 
Breeding value, A      +0.48   -1.92   -4.32         +2.88   -0.72   -4.32 
Dominance dev., D      -0.04   +0.36   -3.24         -0.64   +0.96   -1.44 

Fig. 7.2. Graphical representation of genotypic values (closed circles), and breeding values 
(open circles), of the genotypes for a locus with two alleles, $A_1$ and $A_2$, at frequencies p and q, as 
explained in the text. Horizontal scale: number of $A_1$ genes in the genotype; the genotypes 
$A_2A_2$, $A_1A_2$, $A_1A_1$ have frequencies $q^2$, $2pq$, $p^2$. Vertical scales of 
value: on left - arbitrary values assigned as in Fig. 7.1; on right - deviations from the 
population mean. The figure is drawn to scale for the values: d = \a , and q={. 


The relations between genotypic values, breeding values and dominance 
deviations can be illustrated graphically, as in Fig. 7.2, and the meaning of the dominance 
deviation is perhaps more easily understood in this way. In the figure the genotypic 
values (closed circles) are plotted against the number of $A_1$ genes in the genotype. 
A straight regression line is fitted by least squares to these points, each point being 
weighted by the frequency of the genotype it represents. The position of this line 
gives the breeding values of each genotype, as shown by the open circles. The 
differences between the breeding values and the genotypic values are the dominance 
deviations, indicated by vertical dotted lines. The cross marks the population mean. 
The average effect $\alpha$ of the gene-substitution is given by the difference in breeding 
value between $A_2A_2$ and $A_1A_2$, or between $A_1A_2$ and $A_1A_1$, as indicated. The 
original definition of the average effect of a gene-substitution was given by Fisher 
(1918, 1941) in terms of this linear regression of genotypic value on number of genes. 
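
The regression described above can be carried out numerically. The sketch below fits the frequency-weighted least-squares line of genotypic value on the number of $A_1$ genes and returns its slope, which should equal a + d(q - p). The values a = 4, d = 2 are those of the pygmy gene and are used only as an illustration; the function name is hypothetical.

```python
def regression_slope(q, a, d):
    """Frequency-weighted least-squares slope of genotypic value on the
    number of A1 genes (2, 1 or 0) carried by the genotype."""
    p = 1.0 - q
    n_genes = [2, 1, 0]                  # A1A1, A1A2, A2A2
    values = [a, d, -a]                  # assigned genotypic values
    freqs = [p * p, 2 * p * q, q * q]    # Hardy-Weinberg frequencies
    mean_x = sum(f * x for f, x in zip(freqs, n_genes))
    mean_y = sum(f * y for f, y in zip(freqs, values))
    cov_xy = sum(f * (x - mean_x) * (y - mean_y)
                 for f, x, y in zip(freqs, n_genes, values))
    var_x = sum(f * (x - mean_x) ** 2 for f, x in zip(freqs, n_genes))
    return cov_xy / var_x


for q in (0.1, 0.4):
    p = 1.0 - q
    # Slope of the weighted regression next to the formula a + d(q - p).
    print(q, round(regression_slope(q, 4.0, 2.0), 4), round(4.0 + 2.0 * (q - p), 4))
```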

Table 7.3 Values of genotypes in a two-allele system, measured as deviations from 
the population mean. 

Population mean: $M = a(p - q) + 2dpq$ 
Average effect of gene-substitution: $\alpha = a + d(q - p)$ 

                          Genotypes 
                          $A_1A_1$            $A_1A_2$                   $A_2A_2$ 
Frequencies               $p^2$               $2pq$                      $q^2$ 
Assigned values           $a$                 $d$                        $-a$ 
Deviations from population mean: 
  Genotypic value         $2q(a - pd)$        $a(q - p) + d(1 - 2pq)$    $-2p(a + qd)$ 
                          $2q(\alpha - qd)$   $(q - p)\alpha + 2pqd$     $-2p(\alpha + pd)$ 
  Breeding value          $2q\alpha$          $(q - p)\alpha$            $-2p\alpha$ 
  Dominance deviation     $-2q^2d$            $2pqd$                     $-2p^2d$ 


The dominance deviation can be expressed in terms of the arbitrarily assigned 
genotypic values a and d, by subtraction of the breeding value from the genotypic 
value, as shown in Table 7.3. The genotypic values must first be converted to 
deviations from the population mean, because the breeding values have been expressed 
in this way. The genotypic values, so converted, are given in two forms: in terms 
of $a$ and in terms of $\alpha$. Let us take the genotype $A_1A_1$ to show how these are 
obtained and how the dominance deviation is obtained by subtraction of the breeding 
value. The arbitrarily assigned genotypic value of $A_1A_1$ is $+a$, and the population 
mean is $a(p - q) + 2dpq$. Expressed as a deviation from the population mean, the 
genotypic value is therefore 

$a - [a(p - q) + 2dpq] = a(1 - p + q) - 2dpq = 2qa - 2dpq = 2q(a - dp)$ 

This may be expressed in terms of the average effect $\alpha$ by substituting 
$a = \alpha - d(q - p)$ (from equation [7.5]), and the genotypic value then becomes $2q(\alpha - qd)$. 
Subtraction of the breeding value, $2q\alpha$, gives the dominance deviation as $-2q^2d$. 
By similar reasoning the dominance deviation of $A_1A_2$ is $2pqd$, and that of $A_2A_2$ 
is $-2p^2d$. Thus all the dominance deviations are functions of d. If there is no 
dominance, d is zero and the dominance deviations are also all zero. Therefore in 
the absence of dominance, breeding values and genotypic values are the same. Genes 
that show no dominance (d = 0) are sometimes called ‘additive genes’, or are said 
to ‘act additively’. 

Since the mean breeding value and the mean genotypic value are equal, it follows 
that the mean dominance deviation is zero. This can be verified by multiplying the 
dominance deviation by the frequency of each genotype and summing. The mean 
dominance deviation is thus 


$-2p^2q^2d + 4p^2q^2d - 2p^2q^2d = 0$ 
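
The entries of Table 7.3 can be checked numerically by forming D = G - A for each genotype. The sketch below does this for the pygmy gene (a = 4, d = 2, q = 0.1, values taken from the examples) and compares the result with $-2q^2d$, $2pqd$ and $-2p^2d$; the function name `decompose` is an assumption made for illustration.

```python
def decompose(q, a, d):
    """Frequencies, G, A and D (all as deviations from the population mean)
    for genotypes A1A1, A1A2, A2A2 in Hardy-Weinberg proportions."""
    p = 1.0 - q
    freqs = [p * p, 2 * p * q, q * q]
    mean = a * (p - q) + 2 * d * p * q
    alpha = a + d * (q - p)
    G = [a - mean, d - mean, -a - mean]
    A = [2 * q * alpha, (q - p) * alpha, -2 * p * alpha]
    D = [g - bv for g, bv in zip(G, A)]      # equation [7.8] rearranged
    return freqs, G, A, D


q, a, d = 0.1, 4.0, 2.0
p = 1.0 - q
freqs, G, A, D = decompose(q, a, d)
formula = [-2 * q * q * d, 2 * p * q * d, -2 * p * p * d]
print([round(x, 2) for x in D])
print([round(x, 2) for x in formula])
# The frequency-weighted mean of D should vanish, up to rounding error.
print(abs(sum(f * x for f, x in zip(freqs, D))) < 1e-12)
```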

Interaction deviation 

When only a single locus is under consideration, the genotypic value is made up 
of the breeding value and the dominance deviation only. But when the genotype refers 
to more than one locus, the genotypic value may contain an additional deviation 
due to non-additive combination. Let $G_A$ be the genotypic value of an individual 
attributable to one locus, $G_B$ that attributable to a second locus, and G the aggregate 
genotypic value attributable to both loci together. Then 

$G = G_A + G_B + I_{AB}$ ... [7.9] 

where $I_{AB}$ is the deviation from additive combination of these genotypic values. In 
dealing with the population mean, earlier in this chapter, we assumed that I was 
zero for all combinations of genotypes. If I is not zero for any combination of genes 
at different loci, those genes are said to ‘interact’ or to exhibit ‘epistasis’, the term 
epistasis being given a wider meaning in quantitative genetics than in Mendelian 
genetics. The deviation I is called the interaction deviation or epistatic deviation. 
If the interaction deviation is zero the genes concerned are said to ‘act additively’ 
between loci. Thus ‘additive action’ may mean two different things. Referred to 
genes at one locus it means the absence of dominance, and referred to genes at 
different loci it means the absence of epistasis. 

Loci may interact in pairs or in threes or higher numbers, and the interactions 
may be of several different sorts, as the behaviour of major genes shows. The 
complex nature of the interactions, however, need not concern us, because in the aggregate 
genotypic value interactions of all sorts are treated together as a single interaction 
deviation. So for all loci together we can write 

G = A + D + I ... [7.10] 

where A is the sum of the breeding values attributable to the separate loci, and D 
is the sum of the dominance deviations. 

The mean interaction deviation of all the genotypes in a population is zero, when 
values are expressed as deviations from the population mean. That this must be so 
can be seen from equation [7.10], remembering that the mean G, A, and D are all 
zero. The interaction deviation is not just a property of the interacting genotypes, 
but depends also on the frequencies of the genotypes in the population, and so on 
the gene frequencies. 

Example 7.7 As an example of non-additive combination of two loci we shall take 
the same two colour genes in mice that were used in Example 7.3 to illustrate additive 
combination; but this time we refer to their effects on the size of the pigment granules, 
instead of their number (Russell, 1949). The mean size (diameter in μ) of the granules 
in each of the four genotypes was as shown in the table. 

               B-       bb       Diff. 
C-             1.44     0.77     0.67 
$c^ec^e$       0.94     0.77     0.17 
Diff.          0.50     0.00 

This time the differences are not independent of the other genotype: the $c^e$ gene, 
for example, has quite a large effect on the B- genotype, but none at all on the bb 
genotype. Thus the two loci show epistatic interaction and do not combine additively. 
The interaction deviations of the four genotypes in any particular population would 
depend on the gene frequencies at both loci. 
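
The dependence of the interaction deviations on frequency can be seen by computing them for a chosen population. The following is a sketch under stated assumptions: the value attributable to each locus is taken as the frequency-weighted marginal mean over the other locus (a reading consistent with equation [7.9], not spelled out in the text), the loci are taken to be independent, and the class frequencies used are hypothetical.

```python
def interaction_deviations(values, freq_first, freq_second):
    """Interaction deviation I = G - G_A - G_B for every two-locus class.

    values[i][j] is the genotypic value for class i at the first locus and
    class j at the second; the frequency lists give the class frequencies.
    """
    rows, cols = range(len(freq_first)), range(len(freq_second))
    mean = sum(freq_first[i] * freq_second[j] * values[i][j]
               for i in rows for j in cols)
    # Value attributable to each locus, as a deviation from the population mean.
    g_first = [sum(freq_second[j] * values[i][j] for j in cols) - mean for i in rows]
    g_second = [sum(freq_first[i] * values[i][j] for i in rows) - mean for j in cols]
    return [[values[i][j] - mean - g_first[i] - g_second[j] for j in cols]
            for i in rows]


# Granule sizes from Example 7.7: rows B-, bb; columns C-, c^e c^e.
sizes = [[1.44, 0.94],
         [0.77, 0.77]]
# Hypothetical frequencies (not from the text): bb and c^e c^e each at 0.25.
freq_b = [0.75, 0.25]
freq_c = [0.75, 0.25]
for row in interaction_deviations(sizes, freq_b, freq_c):
    print([round(x, 4) for x in row])
```

With other frequencies the same table of sizes gives different interaction deviations, whereas a table whose differences are constant gives zero throughout.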

Problems 

7.1 Three allelic variants of the red cell acid phosphatase enzyme were present 
in a sample from the population in England. The table below gives the genotypes 
with their frequencies in the sample and the mean enzyme activity of each genotype. 
(CC individuals were not found.) What is the mean enzyme activity in this population? 

Genotype     Frequency (%)     Enzyme activity 
AA           9.6               122 
AB           48.3              154 
BB           34.3              188 
AC           2.8               184 
BC           5.0               212 

Data from Spencer, N., et al. (1964) Nature, 201, 299-300. 

[Solution 6] 

7.2 With the enzyme activities of the red cell acid phosphatase genotypes given 
in Problem 7.1 calculate the mean enzyme activities in populations with the C allele 
absent and the following gene frequencies of A: (1) 0.2, (2) 0.5, (3) 0.8. 

[Solution 16] 

7.3 If there were a locus, overdominant with respect to a metric character, with 
the genotypic values given below, what gene frequency would give a random-mating 
population its maximum mean value, and what would the mean be? 

$A_1A_1$     $A_1A_2$     $A_2A_2$ 
110          150          90 


[Solution 26] 

7.4 Example 7.3 describes the effects of two colour genes on the number of 
pigment granules in the hairs of mice. The brown gene, b, when homozygous, reduced 
the number of granules from 95 to 90, and the extreme dilute gene, $c^e$, reduced 
the number from 95 to 38. The genes’ effects in combination were additive. 
Assuming both genes to be completely recessive, find what would be the mean granule 
number in a population with the b gene at a frequency of 0.5 and the $c^e$ gene at 
a frequency of 0.2. 


[Solution 36] 

7.5 What are the average effects of the two alleles, and the average effect of the 
gene substitution, in the populations specified in Problem 7.2? 

[Solution 46] 

7.6 What are the breeding values of the three genotypes in the three populations 
specified in Problem 7.2? 


[Solution 56] 

7.7 Find the breeding values and dominance deviations of the three genotypes in 
the population specified in Problem 7.3, when the mean is at its maximal value. 

[Solution 66] 

7.8 Calculate the breeding value and the dominance deviation of the genotype bb 
$c^ec^e$ in the population specified in Problem 7.4. Give the breeding value both as 
a deviation from the population mean and in absolute units of granule number. 

[Solution 76] 

7.9 Problem 7.4 dealt with the effects of two colour genes on the number of 
pigment granules in mouse hairs. Example 7.7 describes the effects of the same two 
genes on the size of the pigment granules, and in this case the effects are not additive. 
Work out the interaction deviations of the genotypes in a population in which the 
bb homozygote has a frequency of 0.4 and the $c^ec^e$ homozygote a frequency of 0.2. 
Since both genes are assumed to be completely recessive the dominant homozygote 
and the heterozygote in each case have the same value and can be treated as one 
genotype, so that there are four genotypes whose interaction deviations are to be 
found. It will be useful for a later problem if the values of the four genotypes given 
in Example 7.7 are first converted to deviations from the population mean; this may 
also make the logic clearer. 


[Solution 86] 



VARIANCE 


The genetics of a metric character centres round the study of its variation, for it 
is in terms of variation that the primary genetic questions are formulated. The basic 
idea in the study of variation is its partitioning into components attributable to 
different causes. The relative magnitude of these components determines the genetic 
properties of the population, in particular the degree of resemblance between relatives. 
In this chapter we shall consider the nature of these components and how the genetic 
components depend on the gene frequency. Then, in the next chapter, we shall show 
how the degree of resemblance between relatives is determined by the magnitudes 
of the components. 


Components of variance 

The amount of variation is measured and expressed as the variance: when values 
are expressed as deviations from the population mean the variance is simply the mean 
of the squared values. The components into which the variance is partitioned are 
the same as the components of value described in the last chapter; so that, for 
example, the genotypic variance is the variance of genotypic values, and the 
environmental variance is the variance of environmental deviations. The total variance is 
the phenotypic variance, or the variance of phenotypic values, and is the sum of 
the separate components. The components of variance and the values whose variance 
they measure are listed in Table 8.1. 

The total variance is then, with certain qualifications, the sum of the components 
thus: 

$V_P = V_G + V_E$ ... [8.1a] 
$= V_A + V_D + V_I + V_E$ ... [8.1b] 


The qualifications are, first, that genotypic values and environmental deviations may 
be correlated, in which case $V_P$ will be increased by twice the covariance of G with 
E; and, second, there may be interaction between genotypes and environments, in 
which case there will be an additional component of variance attributable to the 
interaction. These two complications will be dealt with later in this chapter; meantime 
it will be assumed that they do not apply. 
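
The first qualification can be made concrete with a small numerical sketch. The values of G and E below are hypothetical and chosen only so that the arithmetic is easy to follow; `pvar` and `pcov` are helper names introduced here, computing population (not sample) variance and covariance.

```python
def pvar(xs):
    """Variance as the mean squared deviation from the mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def pcov(xs, ys):
    """Covariance as the mean product of deviations from the means."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


G = [-2.0, -1.0, 0.0, 1.0, 2.0]             # hypothetical genotypic values
E_uncorr = [1.0, -1.0, 0.0, -1.0, 1.0]      # uncorrelated with G
E_corr = [-1.0, -0.5, 0.0, 0.5, 1.0]        # positively correlated with G

for E in (E_uncorr, E_corr):
    P = [g + e for g, e in zip(G, E)]       # phenotypic value P = G + E
    # V_P, then V_G + V_E, then the covariance term 2 cov(G, E).
    print(pvar(P), pvar(G) + pvar(E), 2 * pcov(G, E))
```

In the first case the covariance is zero and the phenotypic variance is the simple sum; in the second it exceeds the sum by twice the covariance.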


Table 8.1 Components of variance 

Variance component     Symbol     Value whose variance is measured 
Phenotypic             $V_P$      Phenotypic value 
Genotypic              $V_G$      Genotypic value 
Additive               $V_A$      Breeding value 
Dominance              $V_D$      Dominance deviation 
Interaction            $V_I$      Interaction deviation 
Environmental          $V_E$      Environmental deviation 


Components as proportions of the total 

The partitioning of the variance into its components allows us to estimate the relative 
importance of the various determinants of the phenotype, in particular the role of 
heredity versus environment, or nature and nurture. The question of ‘relative 
importance’ can be answered only if it is expressed in terms of the variance attributable 
to the different sources of variation. The relative importance of a source of 
variation is the variance due to that source, as a proportion of the total phenotypic variance. 
The relative importance of heredity in determining phenotypic values is called the 
heritability of the character. There are, however, two distinctly different meanings 
of ‘heredity’, and heritability, according to whether they refer to genotypic values 
or to breeding values. A character can be ‘hereditary’ in the sense of being 
determined by the genotype or in the sense of being transmitted from parents to offspring, 
and the extent to which it is hereditary in the two senses may not be the same. The 
ratio $V_G/V_P$ expresses the extent to which individuals’ phenotypes are determined 
by the genotypes. This is called the heritability in the broad sense, or the degree 
of genetic determination. The ratio $V_A/V_P$ expresses the extent to which phenotypes 
are determined by the genes transmitted from the parents. This is called the heritability 
in the narrow sense, or simply the heritability. In all that follows, the term 
‘heritability’ will be restricted to mean the narrow-sense heritability, $V_A/V_P$. The heritability 
$V_A/V_P$ determines the degree of resemblance between relatives and is therefore of 
the greatest importance in breeding programmes. The degree of genetic 
determination $V_G/V_P$ is of more theoretical interest than practical importance. It can be 
estimated in the following way. 

Estimation of the degree of genetic determination, $V_G/V_P$ 

Estimation of the genotypic variance $V_G$ is simple in theory though not so easy in 
practice. Neither the genotypic nor the environmental components of variance, $V_G$ 
and $V_E$, can be estimated directly from observations on a single population, but in 
certain circumstances they can be estimated in experimental populations. If one or 
other component could be completely eliminated, the remaining phenotypic variance 
would provide an estimate of the remaining component. Environmental variance 
cannot be removed because it includes by definition all non-genetic variance, and much 
of this is beyond experimental control. Elimination of genotypic variance can, 
however, be achieved experimentally. Individuals with identical genotypes can be 
obtained from a highly inbred line or the $F_1$ of a cross between two such lines, or 
from a clone propagated from a single individual. (Identical twins in man and cattle 
also provide individuals of identical genotype, but their use in partitioning the variance 
is very limited for reasons to be discussed in Chapter 10.) If a group of such 
individuals is raised under the normal range of environmental circumstances, their 
phenotypic variance provides an estimate of the environmental variance $V_E$. 
Subtraction of this from the phenotypic variance of a genetically mixed population then 
gives an estimate of the genotypic variance of this population. This estimation is 
illustrated in the following example. 

Example 8.1 Partitioning of the phenotypic variance into its genotypic and environmental 
components has been done for several characters in Drosophila melanogaster. The results 
are given later, in Table 8.2, but here we may describe the results for one character 
in more detail in order to show how the partitioning is made. The character is the length 
of the thorax (in units of 1/100 mm), which may be regarded as a measure of body-size. 
The phenotypic variance was measured in a genetically mixed, i.e., a random-bred 
population, and in a genetically uniform population, consisting of the $F_1$ generation of three 
crosses between highly inbred lines. The first estimates the genotypic and environmental 
variance together, and the second estimates the environmental variance alone, as shown 
in the table. So, by subtraction, an estimate of the genotypic variance is obtained. This 
shows that 49 per cent of the total variation of thorax length in the genetically mixed 
population is attributable to genetic differences between individuals and 51 per cent to 
non-genetic differences. (Data from F.W. Robertson, 1957b). 

Population     Components       Observed variance 
Mixed          $V_G + V_E$      0.366 
Uniform        $V_E$            0.186 
Difference     $V_G$            0.180 

$V_G/V_P$ = 0.180/0.366 = 49% 
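
The subtraction in Example 8.1 is short enough to write out directly. The sketch below uses only the two observed variances given in the table; the variable names are introduced for illustration.

```python
# Observed phenotypic variances of thorax length, from Example 8.1.
v_mixed = 0.366      # random-bred population: estimates V_G + V_E
v_uniform = 0.186    # F1 of crosses between inbred lines: estimates V_E alone

v_g = v_mixed - v_uniform                 # estimate of the genotypic variance
degree_of_determination = v_g / v_mixed   # V_G / V_P

print(round(v_g, 3))
print(round(100 * degree_of_determination), "per cent")
```

The estimate stands or falls with the assumption, discussed below, that the environmental variance of the uniform group is the same as that of the mixed population.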


The estimation of the genotypic variance in the manner described above is not 
quite as straightforward as it may seem. It rests on the assumption that the 
environmental variance is the same in all genotypes, and this is certainly not always 
true. The environmental variance measured in one inbred line or cross is that shown 
by this one particular genotype, and other genotypes may be more or less sensitive 
to environmental influences and may therefore show more or less environmental 
variance. The environmental variance of the mixed population may therefore not 
be the same as that measured in the genotypically uniform group. Furthermore, some 
characters have been found to be more variable among inbred, homozygous, 
individuals than among cross-bred, heterozygous, individuals, the homozygotes being 
more sensitive to environmental differences. Different sensitivities to the environment 
are an aspect of genotype-environment interaction which is discussed more fully 
later in this chapter. Because of possible differences of environmental sensitivity 
it is desirable to estimate the environmental variance from several different uniform 
groups, inbred lines and their crosses. For the ways of combining the separate 
estimates to get the most reliable mean estimate, see Wright (1968, p. 382). 

The difficulties arising from the use of unrepresentative genotypes can be 
overcome with plants that can be propagated clonally. It is then possible to have a large 
number of genotypes each represented by many clonally produced individuals. If 
the individuals are grown with genotypes randomized with respect to the 
environment, an analysis of variance provides an estimate of the component of variance 
between clones. The variance between clones is due mainly to differences of genotype, 
and can be regarded as an estimate of $V_G$. But there may be environmental effects 
included in it. That is to say, some part of the environmental differences between 
individuals may be transmitted to all their clonal descendants. For this reason the 
ratio $V_G/V_P$ estimated in this way is liable to be an overestimate. Strictly speaking, 
it should be called the ‘clonal repeatability’; the meaning of ‘repeatability’ is explained 
later in this chapter. 

Genetic components of variance 

The partition into genotypic and environmental variance does not take us far toward 
an understanding of the genetic properties of a population, and in particular it does 
not reveal the cause of resemblance between relatives. The genotypic variance must 
be further divided according to the division of genotypic value into breeding value, 
dominance deviation, and interaction deviation. Thus we have: 


Values:                 $G   = A   + D   + I$ 
Variance components:    $V_G = V_A + V_D + V_I$ 
                        (genotypic = additive + dominance + interaction) 


The additive variance, which is the variance of breeding values, is the important 
component since, as already mentioned, it is the chief cause of resemblance between 
relatives and therefore the chief determinant of the observable genetic properties 
of the population and of the response of the population to selection. Moreover, it 
is the only component that can be readily estimated from observations made on the 
population. In practice, therefore, the important partition is into additive genetic 
variance versus all the rest, the rest being non-additive genetic and environmental 
variance. This partitioning yields the ratio $V_A/V_P$, which is the heritability of the 
character. 

Estimation of the additive variance rests on observation of the degree of 
resemblance between relatives, and will be described later when we have discussed 
the causes of resemblance between relatives. Our immediate concern here is to show 
how the genetic components of variance are influenced by the gene frequency. To 
do this we have to express the variance in terms of the gene frequency and the assigned 
genotypic values a and d. We shall consider first a single locus with two alleles, 
thus excluding interaction variance for the moment. 

Additive and dominance variance 

The information needed to obtain expressions for the variance of breeding values 
and the variance of dominance deviations was given in the last chapter in Table 7.3. 
This table gives the breeding values and dominance deviations of the three genotypes, 
expressed as deviations from the population mean. It will be remembered that the 
means of both breeding values and dominance deviations are zero. Therefore no 
correction for an assumed mean is needed, and the variance is simply the mean of 
the squared values. The variances are thus obtained by squaring the values in the 
table, multiplying by the frequency of the genotype concerned, and summing over 
the three genotypes. The additive variance, which is the variance of breeding values, 
is obtained as follows: 

$V_A = 4p^2q^2\alpha^2 + 2pq(q - p)^2\alpha^2 + 4p^2q^2\alpha^2$ 
$= 2pq\alpha^2(2pq + q^2 - 2pq + p^2 + 2pq)$ 
$= 2pq\alpha^2(p^2 + 2pq + q^2)$ 
$= 2pq\alpha^2$ ... [8.3a] 
$= 2pq[a + d(q - p)]^2$ ... [8.3b] 

As in Table 7.3, q is here the frequency of the allele that reduces value. 
Similarly, the variance of dominance deviations is 

$V_D = d^2(4q^4p^2 + 8p^3q^3 + 4p^4q^2)$ 
$= 4p^2q^2d^2(q^2 + 2pq + p^2)$ 
$= (2pqd)^2$ ... [8.4] 

If there is no dominance at the locus under consideration (d = 0), the average effect 
$\alpha$ is equal to $a$ and the expression for the additive variance simplifies to 

$V_A = 2pqa^2$ ... [8.5] 

If there is complete dominance (d = a), the average effect is $\alpha = 2qa$ and the 
additive variance becomes 

$V_A = 8pq^3a^2$ ... [8.6] 

where q is the frequency of the recessive allele. 

With any degree of dominance, the expressions for both the additive and the 
dominance variances become much simpler if the frequencies of all segregating genes 
are one-half (p = q = 0.5), as they are in populations derived from a cross of two 
highly inbred lines. Equations [8.3b] and [8.4] then reduce to 

$V_A = \tfrac{1}{2}a^2, \quad V_D = \tfrac{1}{4}d^2$ ... [8.7] 

For a full account of the analysis of such populations, see Mather and Jinks (1977, 
1982). 
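
The general expressions [8.3b] and [8.4] and their special cases can be checked against one another numerically. This is a minimal sketch; the function names and the trial values of q, a and d are arbitrary choices made for illustration.

```python
def additive_variance(q, a, d):
    """Additive variance from one locus, equation [8.3b]."""
    p = 1.0 - q
    return 2 * p * q * (a + d * (q - p)) ** 2


def dominance_variance(q, a, d):
    """Dominance variance from one locus, equation [8.4]."""
    p = 1.0 - q
    return (2 * p * q * d) ** 2


q, a = 0.3, 1.5          # arbitrary trial values
p = 1.0 - q

# No dominance: [8.3b] should reduce to 2pq a^2.
print(additive_variance(q, a, 0.0), 2 * p * q * a ** 2)
# Complete dominance (d = a): [8.3b] should reduce to 8 p q^3 a^2.
print(additive_variance(q, a, a), 8 * p * q ** 3 * a ** 2)
# Gene frequencies of one-half, any d: a^2 / 2 and d^2 / 4.
d = 0.6
print(additive_variance(0.5, a, d), a ** 2 / 2)
print(dominance_variance(0.5, a, d), d ** 2 / 4)
```

Each printed pair should agree apart from floating-point rounding.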

Total genetic variance 

The total genetic variance, $V_G$, arising from one locus can be calculated directly 
from Table 7.3 in the same way as was done above for $V_A$ and $V_D$. But to do so 
requires a lengthy algebraic reduction, and it is simpler to get $V_G$ from the values 
of $V_A$ and $V_D$ calculated above. Since G = A + D, the variance of G is given by 
$V_G = V_A + V_D + 2\,\mathrm{cov}_{AD}$, where $\mathrm{cov}_{AD}$ is the covariance of breeding values with 
dominance deviations. This covariance can be shown to be zero as follows. The 
breeding values, dominance deviations, and frequencies of the three genotypes were 
given in Table 7.3. Multiplying breeding value by dominance deviation by frequency, 
and summing over the three genotypes, gives the covariance as 

$-4p^2q^3\alpha d + 4p^2q^2(q - p)\alpha d + 4p^3q^2\alpha d = 4p^2q^2\alpha d(-q + q - p + p) = 0$ 

Thus 

$V_G = V_A + V_D$ 
$= 2pq[a + d(q - p)]^2 + [2pqd]^2$ ... [8.8] 

Example 8.2 To illustrate the genetic components of variance arising from a single 
locus, let us return to the pygmy gene in mice, used for several examples in the last 
chapter. From the values tabulated in Example 7.6, we may compute the components 
of variance directly. Since the values are expressed as deviations from the population 
mean, the variance is obtained by multiplying the frequency of each genotype by the 
square of its value, and summing over the three genotypes. For example, the genotypic 
variance when q = 0.1 is $0.81(0.44)^2 + 0.18(-1.56)^2 + 0.01(-7.56)^2 = 1.1664$. The 
additive variance is obtained in the same way from the variance of breeding values, and 
the dominance variance from the variance of dominance deviations. The variances 
obtained are as follows: 

                      q = 0.1     q = 0.4 
Genotypic, $V_G$      1.1664      7.1424 
Additive, $V_A$       1.0368      6.2208 
Dominance, $V_D$      0.1296      0.9216 

The variances may be obtained also, and with less trouble, by use of the formulae given 
above in equations [8.3], [8.4], and [8.8]. The values to be substituted were given in 
Example 7.1; namely, a = 4 and d = 2. Notice that the dominance variance is quite 
small in comparison with the additive. 
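
The direct computation described in Example 8.2 can be written out as follows. The sketch uses the frequencies and deviations tabulated in Example 7.6 for q = 0.1; the helper name `mean_square` is introduced for illustration.

```python
# Genotypes ++, +pg, pg pg at q = 0.1 (values from Example 7.6).
freqs = [0.81, 0.18, 0.01]
G = [0.44, -1.56, -7.56]     # genotypic values, as deviations from M
A = [0.48, -1.92, -4.32]     # breeding values
D = [-0.04, 0.36, -3.24]     # dominance deviations


def mean_square(freqs, xs):
    """Variance of values whose frequency-weighted mean is zero."""
    return sum(f * x * x for f, x in zip(freqs, xs))


v_g = mean_square(freqs, G)
v_a = mean_square(freqs, A)
v_d = mean_square(freqs, D)
cov_ad = sum(f * x * y for f, x, y in zip(freqs, A, D))

print(round(v_g, 4), round(v_a, 4), round(v_d, 4))
print(abs(cov_ad) < 1e-12)          # covariance of A with D should vanish
print(abs(v_g - (v_a + v_d)) < 1e-12)
```

The three variances should agree with the q = 0.1 column of the table, and their additivity follows from the zero covariance.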

The ways in which the gene frequency and the degree of dominance influence 
the magnitude of the genetic components of variance can best be appreciated from 
graphical representations of the relationships derived above in equations [8.3], [8.4], 
and [8.8]. The graphs in Fig. 8.1 show the amounts of genotypic, additive, and 
dominance variance arising from a single locus with two alleles, plotted against the 
gene frequency. Three cases are shown to illustrate the effect of different degrees 
of dominance: in graph (a) there is no dominance (d = 0); in graph (b) there is 
complete dominance ( d = a); and in graph (c) there is ‘pure’ overdominance (a 
= 0). In the first case the genotypic variance is all additive, and it is greatest when 
p = q = 0.5. In the second case the dominance variance is maximal when p = 
q = 0.5, the additive variance is maximal when the frequency of the recessive allele 
is q = 0.75, and the genotypic variance is maximal when q^2 = 0.5, i.e., q ≈ 0.71. 
In the third case the dominance variance is the same as in the second and is maximal 
when p = q = 0.5. The additive variance, however, is zero when p = q = 0.5, 
and has two maxima, one at q = 0.15 and the other at q = 0.85. The genotypic 
variance, in this case, remains practically constant over a wide range of gene 
frequency, though its composition changes profoundly. The general conclusion to be 
drawn from these graphs is that genes contribute much more variance when at 
intermediate frequencies than when at high or low frequencies: recessives at low 
frequency, in particular, contribute very little variance. 
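
The gene frequencies at which the components reach their maxima can be confirmed numerically. The following is a minimal sketch, not a reproduction of Fig. 8.1. It assumes the usual single-locus expressions V_A = 2pq[a + d(q - p)]^2 and V_D = (2pqd)^2, and it uses a simple grid search. The function names and the grid size are our own choices.

```python
def single_locus_variances(q, a, d):
    """Return (V_G, V_A, V_D) for one locus; q is the frequency of the recessive allele."""
    p = 1.0 - q
    v_a = 2 * p * q * (a + d * (q - p)) ** 2
    v_d = (2 * p * q * d) ** 2
    return v_a + v_d, v_a, v_d


def q_at_maximum(component, a, d, lo=0.0, hi=1.0, steps=10000):
    """Grid search for the gene frequency in [lo, hi] that maximizes one component."""
    index = {"G": 0, "A": 1, "D": 2}[component]
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: single_locus_variances(q, a, d)[index])


# (a) no dominance: a = 1, d = 0
print("no dominance, V_G:", q_at_maximum("G", 1, 0))
# (b) complete dominance: a = d = 1
print("complete dominance, V_D, V_A, V_G:",
      q_at_maximum("D", 1, 1), q_at_maximum("A", 1, 1), q_at_maximum("G", 1, 1))
# (c) pure overdominance: a = 0, d = 1; V_A has one maximum on each side of q = 0.5
print("pure overdominance, V_A:",
      q_at_maximum("A", 0, 1, hi=0.5), q_at_maximum("A", 0, 1, lo=0.5))
```

The search is restricted to each half of the range in case (c) because the additive variance is then symmetrical about q = 0.5 and has two equal maxima.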

Fig. 8.1. Magnitude of the genetic components of variance arising from a single locus with two 
alleles, in relation to the gene frequency. (Horizontal axis: gene frequency, q.) Genotypic 
variance - thick lines; additive variance - thin lines; dominance variance - broken lines. 
The gene frequency, q, is that of the recessive allele. The degrees of dominance are: in (a) no 
dominance (d = 0); in (b) complete dominance (d = a); and in (c) ‘pure’ overdominance (a = 0). 
The figures on the vertical scale, showing the amount of variance, are to be multiplied by a^2 
in graphs (a) and (b), and by d^2 in graph (c). 

The foregoing account of the genetic variance is mainly theoretical. In practice 
we are not concerned with gene frequencies or gene effects because these are not 
known except in specially constructed populations. In practice, therefore, we are 
concerned only with the estimation of the components. It should be noted, however, 
that all the components of genetic variance are dependent on the gene frequencies, 
so any estimates of them are valid only for the population from which they are 
estimated. When observations on the resemblance between relatives are available, 
the additive genetic variance can be estimated. (The way in which this is done is 
the subject of the next two chapters.) The total phenotypic variance can then be 
partitioned into V_A : (V_D + V_I + V_E), the non-additive genetic variance being included 
with the environmental variance. If inbred lines are available the environmental 
variance can be estimated and so the phenotypic variance can be partitioned into 
V_G : V_E. 

If both these partitions are made, we can separate the additive genetic from the 
rest of the genetic variance, and so make the three-fold partition into additive genetic, 
non-additive genetic, and environmental variance, V_A : (V_D + V_I) : V_E, the 
dominance and interaction components being lumped together as non-additive genetic 
variance. Examples of this partitioning are given in Table 8.2. 
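
The arithmetic of the three-fold partition is simple subtraction, and may be sketched as follows. The function name is ours, and the figures are the bristle-number and egg-number percentages of Table 8.2, used here only to illustrate the calculation.

```python
def three_fold_partition(v_p, v_a, v_e):
    """Combine the two partitions V_A : rest and V_G : V_E into three parts.

    v_p : total phenotypic variance
    v_a : additive variance, estimated from the resemblance between relatives
    v_e : environmental variance, estimated from genetically uniform groups
    """
    v_g = v_p - v_e                  # genotypic variance, from the V_G : V_E partition
    v_nonadditive = v_g - v_a        # V_D + V_I, obtained by difference
    return {"V_A": v_a, "V_D + V_I": v_nonadditive, "V_E": v_e}


print(three_fold_partition(100, 52, 39))   # bristle number
print(three_fold_partition(100, 18, 38))   # egg number
```

Because the non-additive component is obtained by difference, it carries the sampling errors of both the other estimates.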

A possible misunderstanding about the concept of additive genetic variance, to 
which the terminology may give rise, should be mentioned here. The concept of 
additive variance does not carry with it the assumption of additive gene action; and 
the existence of additive variance is not an indication that any of the genes act 
additively (i.e., show neither dominance nor epistasis). No assumption is made about 
the mode of action of the genes concerned. Additive variance can arise from genes 
with any degree of dominance or epistasis, and only if we find that all the genotypic 
variance is additive can we conclude that the genes show neither dominance nor 
epistasis. 

The existence of more than two alleles at a locus introduces no new principle, 
though it complicates the theoretical description of the effect of the locus. 
Expressions for the additive and dominance variances are given by Kempthorne (1955a). 
The locus contributes additive variance arising from the average effects of its several 
alleles, and dominance variance arising from the several dominance deviations. 

Table 8.2 Partitioning of the variance of four characters in Drosophila melanogaster. 
Components as percentages of the total, phenotypic, variance. 

                                                 Character
                                     (1)        (2)       (3)      (4)
                                     Bristles   Thorax    Ovary    Eggs

Phenotypic             V_P           100        100       100      100
Additive genetic       V_A            52         43        30       18
Non-additive genetic   V_D + V_I       9          6        35       44
Environmental          V_E            39         51        35       38


Characters and sources of data: 

(1) Number of bristles on 4th + 5th abdominal segments (Clayton, Morris, and Robertson, 1957; 
    Reeve and Robertson, 1954). 
(2) Length of thorax (Robertson, 1951b). 
(3) Size of ovaries, i.e. number of ovarioles in both ovaries (Robertson, 1957a). 
(4) Number of eggs laid in 4 days (4th to 8th after emergence) (Robertson, 1957b). 

To arrive at the variance components expressed in the population, the separate 
effects of all loci that contribute variance have to be combined. When a 
random-mating population is in equilibrium, the additive variance arising from all loci 
together is the sum of the additive variances attributable to each locus separately; and 
the dominance variance is similarly the sum of the separate contributions. But when 
more than one locus is under consideration then the interaction deviations, if 
present, give rise to another component of variance, the interaction variance, which 
is the variance of the interaction deviations. 

Interaction variance 

If the genotypes at different loci show epistatic interaction, in the manner described 
in the previous chapter, then the interactions give rise to a component of variance 
V_I, which is the variance of the interaction deviations. Theoretical description of 
the properties of interaction variance rests on its further subdivision into components. 
It is first subdivided according to the number of loci involved: two-factor interaction 
arises from the interaction of two loci, three-factor from three loci, etc. Interactions 
involving larger numbers of loci contribute so little variance that they can be 
ignored, and we shall confine our attention to two-factor interactions since these 
suffice to illustrate the principles involved. The next subdivision of the interaction 
variance is according to whether the interaction involves breeding values or dominance 
deviations. There are thus three sorts of two-factor interactions. Interaction between 
the two breeding values gives rise to additive × additive variance, V_{AA}; interaction 
between the breeding value of one locus and the dominance deviation of the other 
gives rise to additive × dominance variance, V_{AD}; and interaction between the two 
dominance deviations gives rise to dominance × dominance variance, V_{DD}. So the 
interaction variance is broken down into components thus: 

V_I = V_{AA} + V_{AD} + V_{DD} + etc.    ... [8.9] 

the terms designated ‘etc.’ being similar components arising from interactions between 
more than two loci. 

There is no doubt that interaction between loci controlling quantitative characters 
is a frequent occurrence: it has been demonstrated in many studies of Drosophila, 
for example (see Kearsey and Kojima, 1967). It is not easy, however, to estimate 
the amount of variance that it generates, and little is known about its relative 
importance as a source of variation. The experimental evidence is reviewed by Barker 
(1979). For further details of epistatic interaction, see Cockerham (1954, 1963), 
Kempthorne (1957), Crow and Kimura (1970). 

In the partitioning of the variance by relatively simple experiments, such as are 
considered here, most of the interaction variance is included with the dominance 
component, which is then referred to as non-additive genetic variance. This is as 
far as we can go here in the description of the interaction variance, but we shall 
see in the next chapter how it contributes to the resemblance between relatives. 

Variance due to disequilibrium 

There is one additional source of genetic variance that must be mentioned at this 
point, although it will concern us only at a few places in later chapters. It arises 
when a population is not in equilibrium under random mating. Disequilibrium exists 
when the genotype frequencies at two or more loci considered jointly are not what 
would be expected from the gene frequencies. The disequilibrium introduces an 
additional source of genetic variance for the following reason. For simplicity, 
consider just two loci which do not interact in the manner described above. Let G' 
and G'' be genotypic values of individuals with respect to each locus separately, 
and let G be the genotypic value with respect to both jointly, i.e., G = G' + G''. 
The total genotypic variance caused by the two loci together is then 

V_G = V_{G'} + V_{G''} + 2cov_{G'G''}    ... [8.10] 

The covariance term represents correlation between the genotypic values at the two 
loci in different individuals. The correlation can be positive or negative, so 
disequilibrium can either increase or decrease the variance. When more than two loci 
are to be considered, there will be a covariance term for each pair of loci. When 
there is no disequilibrium, all the covariance terms are zero and the variance is as 
described in the previous sections. 
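
The effect of the covariance term in equation [8.10] can be seen in a small exact calculation. The sketch below is illustrative and rests on several assumptions of our own: the gametic frequencies are written in terms of a disequilibrium coefficient D (a symbol not used in the text above), gametes unite at random, and each locus is described by its own a and d. All names are hypothetical.

```python
from itertools import product


def genotypic_value(n_plus, a, d):
    """Value of a genotype carrying n_plus copies (0, 1 or 2) of the increasing allele."""
    return {2: a, 1: d, 0: -a}[n_plus]


def two_locus_variances(p1, p2, D, locus1=(1.0, 0.0), locus2=(1.0, 0.0)):
    """Return (V_G', V_G'', cov_G'G'', V_G) for two non-interacting loci.

    p1, p2         : frequency of the increasing allele at each locus
    D              : gametic phase disequilibrium coefficient (assumed symbol)
    locus1, locus2 : (a, d) for each locus
    """
    gametes = {
        (1, 1): p1 * p2 + D,
        (1, 0): p1 * (1 - p2) - D,
        (0, 1): (1 - p1) * p2 - D,
        (0, 0): (1 - p1) * (1 - p2) + D,
    }
    if min(gametes.values()) < -1e-12:
        raise ValueError("D is outside the range allowed by the gene frequencies")
    rows = []  # (frequency, G', G'') for every ordered pair of uniting gametes
    for (g1, f1), (g2, f2) in product(gametes.items(), repeat=2):
        rows.append((f1 * f2,
                     genotypic_value(g1[0] + g2[0], *locus1),
                     genotypic_value(g1[1] + g2[1], *locus2)))
    m1 = sum(f * x for f, x, _ in rows)
    m2 = sum(f * y for f, _, y in rows)
    v1 = sum(f * (x - m1) ** 2 for f, x, _ in rows)
    v2 = sum(f * (y - m2) ** 2 for f, _, y in rows)
    cov = sum(f * (x - m1) * (y - m2) for f, x, y in rows)
    v_total = sum(f * (x + y - m1 - m2) ** 2 for f, x, y in rows)
    return v1, v2, cov, v_total


for D in (-0.1, 0.0, 0.1):
    print(D, [round(v, 4) for v in two_locus_variances(0.5, 0.5, D)])
```

With the default values (two additive loci of equal effect) a positive D raises the total variance above the sum of the two single-locus variances and a negative D lowers it, while the single-locus variances themselves are unchanged.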

There are two forms of non-random mating that generate disequilibrium, and they 
differ in the way they produce the covariance in equation [8.10]. The first occurs 
when parents are not a random sample of the individuals in their generation. 
Selection of parents, which is the subject of later chapters, constitutes non-random mating 
of this sort. The second form of non-random mating is assortative mating, as described 
in Chapter 1. The two sorts of covariance produced represent different correlations 
of gene effects. First, there is a correlation between genes at different loci in the 
same gamete. This is gametic phase, or linkage, disequilibrium, which was explained 
in Chapter 1. The second sort of covariance represents correlation between the genes 
in uniting pairs of gametes, i.e., between the genes an individual receives from its 
two parents. The first form of non-random mating alone, i.e., selection of parents 
which are then random-mated, generates the first sort of covariance, that due to 
gametic phase disequilibrium. The second form of non-random mating, i.e., 
assortative mating, generates both sorts of covariance. If the source of the disequilibrium 
ceases to operate, the covariance that is not due to gametic phase disequilibrium 
disappears immediately. The gametic phase disequilibrium of unlinked loci is halved 
in each subsequent generation, but with linked loci it persists for longer. 
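
The rate at which gametic phase disequilibrium disappears can be illustrated with the standard result that, under random mating, the disequilibrium is multiplied by (1 - c) in each generation, where c is the recombination frequency between the two loci. This formula is not given in the text above and is introduced here as an assumption. With c = 0.5 it gives the halving stated for unlinked loci. The function name is ours.

```python
def disequilibrium_after(d0, generations, recombination=0.5):
    """Gametic phase disequilibrium remaining after some generations of random mating."""
    return d0 * (1.0 - recombination) ** generations


# Unlinked loci (c = 0.5) compared with closely linked loci (c = 0.05).
for t in range(6):
    print(t, disequilibrium_after(0.2, t), disequilibrium_after(0.2, t, recombination=0.05))
```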

Correlation and interaction between genotype and environment 

Two complications arise in connection with the partitioning of the variance into 
genotypic and environmental components as expressed in equation [8.1]. These can 
both normally be neglected without seriously affecting the conclusions drawn from 
partitioning the variance, but it is important to know what the consequences of 
neglecting them are. 

Correlation 

In the foregoing account of the variance components it has been assumed that 
environmental deviations and genotypic values are independent of each other; in 
other words, that there is no correlation between genotypic value and environmental 
deviation, such as would arise if the better genotypes were given better 
environments. Correlation between genotype and environment is seldom an important 
complication, and can usually be neglected in experimental populations, where 
randomization of environment is one of the chief objects of experimental design. There 
are some situations, however, in which the correlation exists. Milk-yield in dairy 
cattle provides an example. The normal practice of dairy husbandry is to feed cows 
according to their yield, the better phenotypes being given more food. This introduces 
a correlation between phenotypic value and environmental deviation; and, since 
genotypic and phenotypic values are correlated, there is also a correlation between 
genotypic value and environmental deviation. Another example is human 
intelligence. The phenotypic values of the parents affect the environment in which 
the children grow up; so, to the extent that intelligence is inherited, this introduces 
a correlation between the genotype and the environment of the children. Equation 
[8.1a] is true only if environmental deviations and genotypic values are uncorrelated. 
When a correlation is present the phenotypic variance is increased by twice the 
covariance of genotypic values and environmental deviations, and equation [8.1a] 
becomes 

V_P = V_G + V_E + 2cov_{GE}    ... [8.11] 

The only way by which cov_{GE} could be measured is if we estimated V_E directly as 
in Example 8.1, and also estimated V_G directly by the variance of varying genotypes 
in a constant environment. Then subtraction of the directly estimated V_G and V_E 
from V_P in equation [8.11] would yield an estimate of 2cov_{GE}. This, however, could 
only be done if inbred lines were available and if the non-random aspects of the 
environment, such as feeding levels, could be identified. The covariance, being in 
practice unknown, is best regarded as part of the genetic variance because the 
non-random aspects of the environment are a consequence of the genotypic value and 
so an individual’s environment can be thought of as part of its genotype. This is 
not unreasonable with cows’ milk-yield. It is less satisfactory with human intelligence 
because the environmental effects on the children are not a consequence of their 
own genotypes but of their parents’ genotypes. 
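
Equation [8.11] can be illustrated with simulated values. The sketch below is hypothetical throughout: the genotypic values are drawn at random, and the environmental deviation is made to depend partly on the genotypic value, as a crude stand-in for a husbandry rule that gives better genotypes better environments. It shows only that the phenotypic variance then exceeds V_G + V_E by twice the covariance.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(1)                     # fixed seed so that the sketch is repeatable
G = [rng.gauss(0.0, 1.0) for _ in range(5000)]
# Assumed rule: half of each genotypic value is added to the environmental deviation.
E = [0.5 * g + rng.gauss(0.0, 1.0) for g in G]
P = [g + e for g, e in zip(G, E)]

v_p, v_g, v_e, cov_ge = variance(P), variance(G), variance(E), covariance(G, E)
print("V_P                  :", v_p)
print("V_G + V_E            :", v_g + v_e)
print("V_G + V_E + 2cov_GE  :", v_g + v_e + 2 * cov_ge)
```

In the simulation G and E are known separately, which is exactly what is lacking in practice; the sketch is therefore a demonstration of the identity and not a method of estimation.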

Interaction 

Another assumption that has been made, which is not always justifiable, is that a 
specific difference of environment has the same effect on different genotypes; or, 
in other words, that we can associate a certain environmental deviation with a specific 
difference of environment, irrespective of the genotype on which it acts. When this 
is not so there is an interaction, in the statistical sense, between genotypes and 
environments. There are several forms which this interaction may take. For 
example, a specific difference of environment may have a greater effect on some genotypes 
than on others; or there may be a change in the order of merit of a series of genotypes 
when measured under different environments. That is to say, genotype A may be 
superior to genotype B in environment X, but inferior in environment Y. 

When interaction between genotypes and environments is present, the phenotypic 
value of an individual is not simply P = G + E, as in equation [7.1], but includes 
also an interaction component: P = G + E + I_{GE}. The interaction component 
gives rise to an additional source of variation and equation [8.1a] becomes 
V_P = V_G + V_E + V_{GE}. 

In an experiment of the sort illustrated in Example 8.1 the genetically uniform 
group is a single genotype. Its variance is due entirely to environmental differences 
among individuals, and depends on the way in which the particular genotype responds 
to the environmental differences. Therefore the variance due to interaction is 
included with the environmental variance estimated from the phenotypic variance 
of that genotype. Some genotypes, as already noted, may be more sensitive than 
others to environmental differences. So, to some extent, the environmental variance 
is a property of the genotype. But the source of the variation is environmental and 
not genetic. It is therefore logical, as well as experimentally necessary, to regard 
any variance due to genotype-environment interaction as being part of the 
environmental variance included in any estimate of V_E. 

Genotype-environment interaction becomes very important if individuals of a 
particular population are to be reared under different conditions. For example, a breed 
of livestock may be used by different farmers who treat it differently; and varieties 
of plants are grown in different seasons, at different places, and under different 
conditions. This situation is different in one respect from what we have been 
considering hitherto. The different farms, seasons, or locations are ‘specific environments’, 
shared by all the individuals in them, and are more in the nature of ‘treatments’. 
In the situation considered hitherto, each individual has its own particular 
environment, and individuals cannot be grouped according to any particular aspect of their 
environments, such as nutrition, temperature, or crowding. When individuals are 
reared in specific environments the genotype-environment interaction can be studied 
in more detail. If genotypes can be replicated, and more than one individual of each 
of several genotypes are reared in different specific environments, then an analysis 
of variance in a two-way classification of genotypes × environments will yield 
estimates of the variance between genotypes, the variance between the specific 
environments, and the variance attributable to interaction of genotypes with 
environments. If there is no interaction, then the best genotype in one environment 
will be the best in all. But if there is much interaction then particular genotypes 
must be sought for particular environments. The specialization of breeds or varieties 
for specific environments will be taken up again later, in Chapter 19, because it 
can be discussed more usefully from a different viewpoint. We shall next consider 
the idea of environmental sensitivity and how it can be measured. 
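
The two-way analysis of variance of genotypes × environments mentioned above may be sketched as follows. This is a minimal version for a balanced design with equal replication in every cell. The data are invented, the names are ours, and the interaction component is estimated from the mean squares in the usual way, as (interaction mean square - within mean square) / n.

```python
def two_way_anova(data):
    """Balanced genotype x environment analysis of variance.

    data[g][e] is a list of n replicate measurements of genotype g in environment e.
    Returns sums of squares, degrees of freedom, mean squares, and the estimated
    interaction component of variance.
    """
    n_g, n_e, n = len(data), len(data[0]), len(data[0][0])
    cell = [[sum(obs) / n for obs in row] for row in data]
    grand = sum(sum(row) for row in cell) / (n_g * n_e)
    g_mean = [sum(row) / n_e for row in cell]
    e_mean = [sum(cell[g][e] for g in range(n_g)) / n_g for e in range(n_e)]

    ss = {
        "genotypes": n_e * n * sum((m - grand) ** 2 for m in g_mean),
        "environments": n_g * n * sum((m - grand) ** 2 for m in e_mean),
        "interaction": n * sum((cell[g][e] - g_mean[g] - e_mean[e] + grand) ** 2
                               for g in range(n_g) for e in range(n_e)),
        "within": sum((x - cell[g][e]) ** 2
                      for g in range(n_g) for e in range(n_e) for x in data[g][e]),
    }
    df = {"genotypes": n_g - 1, "environments": n_e - 1,
          "interaction": (n_g - 1) * (n_e - 1), "within": n_g * n_e * (n - 1)}
    ms = {k: ss[k] / df[k] for k in ss}
    v_ge = (ms["interaction"] - ms["within"]) / n
    return ss, df, ms, v_ge


# Invented data: genotypes A and B, environments X and Y, two replicates per cell.
data = [[[10, 12], [20, 22]],    # genotype A in X and in Y
        [[14, 16], [16, 18]]]    # genotype B in X and in Y
ss, df, ms, v_ge = two_way_anova(data)
print(ss, df, ms, v_ge)
```

In these invented data genotype A is inferior to B in environment X but superior in Y, so the two genotypes have equal means over environments and all the genetic differences appear as interaction.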

Environmental sensitivity. Some of the genotype-environment interaction can be 
ascribed to differences of sensitivity of different genotypes. In other words, a given 
environmental difference has more effect on some genotypes than it has on others. 
To measure environmental sensitivities, and to see how much of the interaction 
variance is ascribable to differences of sensitivity, different genotypes are reared 
or grown in a range of specific environments. The specific environments have to 
be quantified as more or less favourable for expression of the character under study. 
The only way in which environments can be quantified is by the mean performance 
of all the genotypes. In other words, the measure of an environment is the mean 
of all genotypes in that environment. This will be called the environmental value. 
Each genotype has its own mean value in each of the specific environments. The 
genotype’s environmental sensitivity is then the regression of its own value on the 
environmental value. The procedure will be made clearer by Example 8.3 below. 
The variance due to interaction of genotypes with the specific environments is 
estimated from an analysis of variance, and the amount attributable to differences 
of sensitivity is obtained from the heterogeneity of regression slopes. For details, 
see Perkins and Jinks (1968); and for another example, see Zuberi and Gale (1976). 

Example 8.3 (From data kindly supplied by Professor J.L. Jinks.) Ten inbred lines of 
the tobacco plant Nicotiana rustica were grown in each of eight specific environments 
created by different dates of sowing and different densities of planting. The final heights 
of 8 plants of each line in each environment were measured. An analysis of variance 
showed that the differences between the genotypes (lines) and between the environments 
were significant and that there was a significant genotype × environment interaction. 
The environmental sensitivities of four of the genotypes are depicted in Fig. 8.2. To 
estimate the environmental sensitivity of a genotype, the general effect of each 
environment is first evaluated as the mean of all 10 genotypes in that environment. Then the 
value of each genotype is plotted against the environmental mean. The slope of the 
regression line measures the environmental sensitivity of the genotype. A regression 
coefficient of 1.0 represents the average sensitivity of all genotypes. The sensitivities of the 
four genotypes in the graphs are shown at the right-hand margin. Of particular interest 
are the two genotypes in the middle, which had the highest and lowest sensitivities of 
the 10 lines. They had nearly equal means over all environments, but in consequence 
of their different sensitivities one was taller in good environments and the other was 
taller in poor environments, a reversal of the order of merit. (For further details see 
Mather and Jinks, 1982, p. 118.) 
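
The calculation of environmental sensitivities described in this example may be sketched as follows. The heights are invented and are not the Nicotiana data; the names are ours. Each environment is given its environmental value, the mean of all genotypes in it, and each genotype's sensitivity is the slope of the regression of its own values on the environmental values.

```python
def regression_slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def environmental_sensitivities(means):
    """means[name] lists one genotype's mean value in each specific environment."""
    names = list(means)
    n_env = len(means[names[0]])
    env_value = [sum(means[g][e] for g in names) / len(names) for e in range(n_env)]
    slopes = {g: regression_slope(env_value, means[g]) for g in names}
    return env_value, slopes


# Invented mean heights (cm) of three lines in four specific environments.
heights = {
    "line 1": [80, 95, 110, 125],
    "line 2": [90, 98, 106, 114],
    "line 3": [70, 92, 114, 136],
}
env_value, slopes = environmental_sensitivities(heights)
print(env_value)
print({g: round(b, 3) for g, b in slopes.items()})
print(sum(slopes.values()) / len(slopes))
```

The mean of the slopes is 1.0 by construction, because the environmental values are themselves the means of the genotypes. In the invented data lines 2 and 3 have nearly equal means, but line 2 is the taller in the poorest environment and line 3 in the best, which is the kind of reversal described above.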

Fig. 8.2. Plant height (cm) of Nicotiana rustica genotypes grown in eight specific environments 
as explained in Example 8.3. (Horizontal axis: environmental value.) (These are genotypes 
numbered 3, 6, 7, 10 in Tables 42 and 44 of Mather and Jinks, 1982.) 

Environmental variance 

Environmental variance, which by definition embraces all variation of non-genetic 
origin, can have a great variety of causes and its nature depends very much on the 
character and the organism studied. Generally speaking, environmental variance is 
a source of error that reduces precision in genetic studies and the aim of the 
experimenter or breeder is therefore to reduce it as much as possible by careful 
management or proper design of experiments. Nutritional and climatic factors are 
the commonest external causes of environmental variation, and they are at least partly 
under experimental control. Maternal effects form another source of environmental 
variation that is sometimes important, particularly in mammals, but is less 
susceptible to control. Maternal effects are prenatal and postnatal influences, mainly 
nutritional, of the mother on her young: we shall have more to say about them in the 
next chapter in connection with resemblance between relatives. Error of 
measurement is another source of variation, though it is usually quite trivial. When a character 
can be measured in units of length or weight it is usually measured so accurately 
that the variance attributable to measurement is negligible in comparison with the 
rest of the variance. Some characters, however, cannot strictly speaking be measured, 
but have to be graded by judgement into classes. Carcass qualities of livestock are 
an example. With such characters the variance due to measurement may be 
considerable. 

In addition to the variation arising from recognizable causes, such as those 
mentioned, there is usually also a substantial amount of non-genetic variation whose cause 
is unknown, and which therefore cannot be eliminated by experimental design. This 
is generally referred to as ‘intangible’ variation. Some of the intangible variation 
may be caused by ‘environmental’ circumstances, in the common meaning of the 
word (that is, by circumstances external to the individual), even though their 
nature is not known. Some, however, may arise from ‘developmental’ variation: 
variation, that is, which cannot be attributed to external circumstances but is 
attributed, in ignorance of its exact nature, to ‘accidents’ or ‘errors’ of development 
as a general cause. Characters whose intangible variation is predominantly 
developmental are those connected with anatomical structure, which do not change 
after development is complete, such as skeletal form, pigmentation, or bristle number 
in Drosophila. Characters more susceptible to the influences of the external 
environment, in contrast, are those connected with metabolic processes, such as growth, 
fertility, and lactation. 

Example 8.4 Human birth weight provides an example of a character subject to much 
environmental variation whose nature has been analysed in detail (Penrose, 1954; Robson, 
1955). The partitioning of the phenotypic variance given in the table shows the relative 
importance of all the identified sources of variation, birth weight being regarded as a 
character of the child. All the environmental variation is ‘maternal’ in the sense that 
it is connected with the prenatal environment, but several distinct components of the 
maternal environment are distinguished. ‘Maternal genotype’, which accounts for 20 
per cent of the total phenotypic variance, reflects genetic variation (chiefly additive) 
between mothers in the birth weight of their children; i.e., birth weight regarded as a 
character of the mother. ‘Maternal environment, general’, which accounts for another 
18 per cent, reflects non-genetic variation between mothers in the same way. These two 
components, totalling 38 per cent, are maternal causes of variation in birth weight that 
affect all children of the same mother alike. ‘Maternal environment, immediate’ means 
causes attributable to the mother but differing in successive pregnancies. Two causes 
of the same nature, ‘Age of mother’ and ‘Parity’ (i.e., whether the child is the first, 
second, etc.), are separately identifiable. Finally, the ‘Intangible’ variation is all the 
remainder, of which the cause cannot be identified. To explain how these various 
components were estimated would take too much space, and could not properly be done until 
the end of Chapter 10. It must suffice to say that the estimates all come from comparisons 
of the degree of resemblance between identical twins, fraternal twins, full sibs, children 
of sisters, and other sorts of cousins. It should be noted that the estimates are not very 
precise and must not be taken as definitive parameters of human populations. In another 
study (Morton, 1955) with different data, no effect of genetic factors in the foetus was 
found, all the variation of birth weight being environmental. 

Partitioning of variance of human birth-weight. Components as percentages of the total, 
phenotypic, variance. 

Cause of variation                          % of total

Genetic
  Additive                                      15
  Non-additive (approx)                          1
  Sex                                            2
  Total genotypic                               18

Environmental
  Maternal genotype                             20
  Maternal environment, general                 18
  Maternal environment, immediate                6
  Age of mother                                  1
  Parity                                         7
  Intangible                                    30
  Total environmental                           82


Multiple measurements: repeatability 

When more than one measurement of the character can be made on each individual, 
the phenotypic variance can be partitioned into variance within individuals and 
variance between individuals. This partitioning leads to a ratio of variance 
components called the repeatability, which has three main uses: to show how much is 
to be gained by the repetition of measurements, to set upper limits to the ratios 
V_G/V_P or V_A/V_P, and to predict future performance from past records. It may also 
throw light on the nature of the environmental variance. The partitioning of the 
variance corresponding to the repeatability is not a part of genetic theory, because 
it is the environmental, not the genetic, variance that is partitioned. It does, however, 
have some practical implications for genetical analysis and breeding programmes, 
as we shall see. 

There are two ways by which the repetition of a character may provide multiple 
measurements: by temporal repetition and by spatial repetition. Milk-yield and 
litter size are examples of characters repeated in time. Milk-yield can be measured 
in successive lactations, and litter size in successive pregnancies. Several 
measurements of each individual can thus be obtained. The variance of yield per 
lactation, or of the number of young per litter, can then be analysed into a 
component within individuals, measuring the differences between the successive 
performances of the same individual, and a component between individuals, measuring the 
permanent differences between individuals. The within-individual component is entirely 
environmental in origin, caused by temporary differences of environment between 
successive performances. The between-individual component is partly environmental 
and partly genetic, the environmental part being caused by circumstances that 
affect the individuals permanently. By this analysis, therefore, the variance due to 
temporary environmental circumstances is separated from the rest, and can be 
measured. 

Characters repeated in space are chiefly structural or anatomical, and are found 
more often in plants than in animals. For example, plants that bear more than one 
fruit yield more than one measurement of any character of the fruit, such as its shape 
or seed content. Spatial repetition in animals is chiefly found in characters that can 
be measured on the two sides of the body or on serially repeated parts, such as the 
number of bristles on the abdominal segments of Drosophila. With spatially repeated 
characters the within-individual variance is again entirely environmental in origin 
but, unlike that of temporally repeated characters, it represents the ‘developmental’ 
variation arising from localized circumstances operating during development. 

In order that we may discuss both temporal and spatial repetition together, we 
shall use the terms special environmental variance, V_{Es}, to refer to the 
within-individual variance arising from temporary or localized circumstances; and general 
environmental variance, V_{Eg}, to refer to the environmental variance contributing 
to the between-individual component and arising from permanent or non-localized 
circumstances. The ratio of the between-individual component to the total phenotypic 
variance is the intraclass correlation r. It is the correlation between repeated 
measurements of the same individual, and is known as the repeatability of the 
character. The partitioning of the phenotypic variance expressed by the repeatability 
is thus into two components, V_{Es} versus (V_G + V_{Eg}), so that the repeatability is 

r = (V_G + V_{Eg}) / V_P    ... [8.12] 

The repeatability therefore expresses the proportion of the variance of single 
measurements that is due to permanent, or non-localized, differences between 
individuals, both genetic and environmental. It allows the separate estimation of the 
component V_{Es} due to the special environment which, as a proportion of the total, 
is given by 

V_{Es} / V_P = 1 - r    ... [8.13] 
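
The estimation of the repeatability from repeated measurements may be sketched as a one-way analysis of variance between and within individuals. The sketch assumes a balanced set of records, with the same number of measurements n on every individual, and estimates the between-individual component from the mean squares in the usual way, as (between mean square - within mean square) / n. The records are invented and the names are ours.

```python
def repeatability(records):
    """Intraclass correlation from a balanced set of repeated measurements.

    records[i] lists the n measurements made on individual i.
    Returns (r, between-individual component, within-individual component).
    """
    k, n = len(records), len(records[0])
    ind_mean = [sum(r) / n for r in records]
    grand = sum(ind_mean) / k
    ms_between = n * sum((m - grand) ** 2 for m in ind_mean) / (k - 1)
    ms_within = sum((x - m) ** 2
                    for r, m in zip(records, ind_mean) for x in r) / (k * (n - 1))
    v_within = ms_within                       # estimates V_Es
    v_between = (ms_between - ms_within) / n   # estimates V_G + V_Eg
    return v_between / (v_between + v_within), v_between, v_within


# Invented records: three individuals, each measured twice.
r, v_between, v_within = repeatability([[10, 12], [14, 16], [18, 20]])
print(r, v_between, v_within)
```

The two components returned correspond to the numerator of equation [8.12] and to V_{Es}, so that the second, as a proportion of their sum, is 1 - r.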

From equation [8.12] it can be seen that the repeatability sets an upper limit to 
the degree of genetic determination V_G/V_P, and to the heritability V_A/V_P. The 
repeatability is usually much easier to determine than either of these two ratios and 
it may often be known when they are not. The heritability, which is the ratio of 
practical importance, may of course be much less than the repeatability, but it 
cannot be greater, and this knowledge is better than no knowledge at all of the heritability. 
The repeatability differs very much according to the nature of the character, and 
also, of course, according to the genetic properties of the population and the 
environmental conditions under which the individuals are kept. The estimates in Table 
8.3 give some idea of the sort of values that may be found with various characters, 
and two cases are described in more detail in Example 8.5. 


Table 8.3 Some examples of repeatability 


Repeatability 


Drosophila melanogaster: 

Abdominal bristle number (see Example 8.5) 0.42 

Ovary size (see Table 8.4) 0.54 

Mouse : (original data) 

Weight at 6 weeks 0.96 

Litter size in 1st and 2nd litters 0.45 

Sheep: (Morley, 1951) 

Weight of fleece in different years 0.74 

Cattle (British Friesians): (Barker and Robertson, 1966) 

Milk yield in 1st and 2nd lactations 0.40 

Percent fat in 1st and 2nd lactations 0.67 
There are two assumptions implicit in the idea of repeatability. The first is that
the variances of the different measurements are equal, and have their components
in the same proportions. The second is that the different measurements reflect what
is genetically the same character, a point that will be explained in Chapter 19.
If these assumptions are not valid, the repeatability becomes a somewhat vague concept,
without precise meaning in relation to the components of variance. Some of
the characters in Table 8.3 do not conform strictly with the assumptions.


Example 8.5 The number of bristles on the ventral surfaces of the abdominal segments 
is a character that has been much studied in Drosophila melanogaster , because it is 
technically convenient and its genetic properties are relatively simple. We have already 
mentioned it several times but have not yet used it as an example. There are about 20 
bristles on each of 3 segments in males, and each of 4 segments in females. The number 
of bristles per segment can therefore be treated as a spatially repeated character. The 
sources of variation in this character have been studied in detail by Reeve and Robertson 
(1954), and the components of variance found are given in the table. 




                                        ♂♂        ♀♀
Total phenotypic    $V_P$               4.24      5.44
Between flies       $V_G + V_{Eg}$      1.82      2.19
Within flies        $V_{Es}$            2.42      3.25
Repeatability                           0.429     0.403
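
The repeatabilities in the table follow directly from equation [8.12]. As a minimal sketch (the function and dictionary names are illustrative, not from the source), the calculation can be reproduced from the between-fly and within-fly components:

```python
# Variance components from the table above (Reeve and Robertson, 1954).
# The dictionary layout and names are assumptions made for this sketch.
components = {
    "males": {"between_flies": 1.82, "within_flies": 2.42},
    "females": {"between_flies": 2.19, "within_flies": 3.25},
}


def repeatability(between, within):
    """r = (V_G + V_Eg) / V_P, taking V_P as between + within (equation [8.12])."""
    return between / (between + within)


for sex, c in components.items():
    r = repeatability(c["between_flies"], c["within_flies"])
    # 1 - r is the share of special environmental variance (equation [8.13]).
    print(f"{sex}: r = {r:.3f}, V_Es / V_P = {1 - r:.3f}")
```

The proportion due to special environment, 1 - r, comes out at a little under 0.6 in both sexes.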


Estimation of the repeatability of a character separates off the component of variance
due to special environment, $V_{Es}$, but it leaves the other component of environmental
variance, that due to general environment, $V_{Eg}$, confounded with the
genotypic variance, as shown in the above example. To complete the partitioning
we need to separate $V_{Eg}$ from $V_G$. This can be done in two ways: either by
estimating the genotypic variance $V_G$, in the manner of Example 8.1, or by
calculating the repeatability in a genetically uniform group such as an inbred line
or $F_1$ cross. Where there is no genetic variation, the between-individual component
of variance consists only of the general environment component and the
repeatability measures the ratio $V_{Eg}/(V_{Eg} + V_{Es})$. The environmental variance has
been partitioned in this way for two characters in Drosophila and the full partitioning
that this leads to is shown in Table 8.4. The main point of interest is the very
small amount of variation arising from the general environment. These characters
are therefore very little influenced by the external environment; or perhaps it would
be more accurate to say that the technique of rearing the flies has been very successful
in eliminating unwanted sources of environmental variation. Under the conditions
of the experiment, virtually all the non-genetic variation is due to strictly
localized causes that influence the segments or the ovaries independently. Because
the $V_{Eg}$ component is so small, the repeatability gives a good estimate of the degree
of genetic determination, $V_G/V_P$. Furthermore, the non-additive genetic variance of
bristle-number is small, so the repeatability is not very different from the heritability,
$V_A/V_P$.





Table 8.4 Partitioning of the phenotypic variance of two characters in Drosophila
melanogaster. Each component is given as a percentage of the total variance of single
measurements.

Component                              (1) Bristle number    (2) Ovary size

Additive genetic, $V_A$                33                    23
Non-additive genetic, $V_{NA}$         6                     27
General environment, $V_{Eg}$          3                     4
Special environment, $V_{Es}$          58                    46
Total, phenotypic, $V_P$               100                   100

Characters and sources of data:

(1) Counts on one abdominal segment (Reeve and Robertson, 1954). The results for males and
females were calculated separately and then averaged.

(2) Number of ovarioles in one ovary (Robertson, 1957a).

The proportions of the genetic components here are lower than those in Table 8.2 because
Table 8.2 refers to the variance of the sum of two measurements; see Example 8.6.

Gain from multiple measurements. One way in which knowledge of the repeatability 
is useful is to indicate the gain in accuracy expected from multiple measurements. 
If the repeatability is high, little will be gained; if it is low, more will be gained. 
The question is: how is the gain in accuracy related to the repeatability? The only 
component of variance that is reduced by repeated measurements is that due to the 
special environment, $V_{Es}$, and the amount by which it is reduced depends on the
number of measurements made. Suppose that each individual is measured n times,
and that the mean of these n measurements is taken to be the phenotypic value of
the individual, say $P_{(n)}$. Then the phenotypic variance is made up of the genotypic
variance, the general environmental variance, and one nth of the special environmental
variance:

$$V_{P(n)} = V_G + V_{Eg} + \frac{1}{n} V_{Es} \qquad [8.14]$$

Thus, increasing the number of measurements reduces the amount of variance due 
to special environment that appears in the phenotypic variance, and this reduction 
of the phenotypic variance represents the gain in accuracy. The variance of the mean 
of n measurements as a proportion of the variance of one measurement can be 
expressed in terms of the repeatability, as follows. Writing the components in terms 
of r and 1 - r, from equations [8.12] and [8.13], and substituting into equation [8.14], 
gives 

$$V_{P(n)} = \left(r + \frac{1 - r}{n}\right) V_P$$

and rearrangement leads to

$$\frac{V_{P(n)}}{V_P} = \frac{1 + r(n - 1)}{n} \qquad [8.15]$$
Fig. 8.3. Gain in accuracy from multiple measurements of each individual. The vertical scale
gives the variance of the mean of n measurements as a percentage of the variance of one 
measurement. The horizontal scale gives the number of measurements, up to 10. The four 
graphs refer to characters of different repeatability as indicated. 


This ratio is plotted in Fig. 8.3 to show how the phenotypic variance is reduced
by multiple measurements, with characters of different repeatabilities. When the
repeatability is high, and there is therefore little special environmental variance,
multiple measurements give little gain in accuracy. When the repeatability is low,
multiple measurements may lead to a worthwhile gain in accuracy. The gain in accuracy,
however, falls off rapidly as the number of measurements increases, and it is seldom
worthwhile to make more than two or three measurements. In practice it does not
make any difference whether one works with the mean or with the sum of the
measurements: though the actual variances will be different, the relative magnitudes
of the components are not affected.
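
The diminishing return from extra measurements can be seen by tabulating equation [8.15]. The sketch below is only illustrative: the repeatabilities 0.25, 0.5 and 0.75 are assumed values chosen for the example and are not necessarily those plotted in Fig. 8.3.

```python
def variance_ratio(r, n):
    """V_P(n) / V_P = (1 + r(n - 1)) / n, equation [8.15]."""
    return (1 + r * (n - 1)) / n


# Illustrative repeatabilities (assumed for this sketch).
for r in (0.25, 0.5, 0.75):
    row = ", ".join(
        f"n={n}: {100 * variance_ratio(r, n):.0f}%" for n in (1, 2, 3, 5, 10)
    )
    print(f"r = {r}: {row}")
```

As n increases the ratio approaches r itself, which is why most of the attainable reduction is already obtained with the first two or three measurements.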

Example 8.6 Most of the studies of abdominal bristle number in Drosophila have been
based on counts of the bristles on two segments. The table shows how the percentage
composition of the variance is affected by counting two segments instead of one.
Column (1) gives the percentage composition of the phenotypic variance when only one
segment is counted, as in Table 8.4. If two segments are counted and the mean of the
two counts is taken as the phenotype of the individual, $V_{Es}$ is halved but the other
components are unaltered, giving the figures in column (2). The total variance $V_P$ is now
reduced to 71. Dividing each component by 71 and multiplying by 100 gives, in column
(3), the percentage composition of the variance when phenotypic values are based on
counts of two segments. The point of practical importance is that the additive genetic
variance has been increased from 33 to 46 per cent. (The reason why $V_A$ is 46 per cent
and not 52 per cent as in Table 8.2 is that the two estimates are derived from different
strains.)


              One segment    Two segments
              (1)            (2)       (3)

$V_A$         33             33        46.4
$V_{NA}$      6              6         8.2
$V_{Eg}$      3              3         4.1
$V_{Es}$      58             29        41.3
$V_P$         100            71        100.0
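
The steps of Example 8.6 can be sketched in a few lines. This is an illustration only: the function name and dictionary keys are assumptions, and because the calculation starts from the rounded percentages in column (1), its results agree with column (3) only approximately (to within about half a percentage point).

```python
# Column (1): percentage composition for one segment, as in Table 8.4.
one_segment = {"V_A": 33, "V_NA": 6, "V_Eg": 3, "V_Es": 58}


def composition_of_mean(components, n):
    """Composition of the variance when the phenotype is the mean of n measurements.

    Only V_Es is divided by n (equation [8.14]); the other components are unchanged.
    Returns the new total and each component as a percentage of that total.
    """
    reduced = dict(components)
    reduced["V_Es"] = components["V_Es"] / n
    total = sum(reduced.values())
    percentages = {name: 100 * value / total for name, value in reduced.items()}
    return total, percentages


total, pct = composition_of_mean(one_segment, 2)
print(total)  # 71.0, the total of column (2)
for name, value in pct.items():
    print(f"{name}: {value:.1f}")
```

Calling `composition_of_mean(one_segment, n)` with larger n shows the additive share continuing to rise, but by progressively smaller steps.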


The advantage for breeding programmes from the gain in accuracy is the increased 
proportion of additive genetic variance. That is to say, the mean of two or more 
measurements has a higher heritability than does a single measurement and is therefore 
a better guide to an individual’s breeding value. This increase of heritability, however, 
cannot be relied on unless the two assumptions mentioned earlier are valid, namely 
that the different measurements have equal variances and represent the same character 
genetically. These conditions are met by the number of bristles on the abdominal 
segments of Drosophila, and the conclusions reached in Example 8.6 are valid. A
character for which the assumptions do not hold is milk-yield of cows in successive
lactations (Rendel et al., 1957). In this case the proportion of additive genetic variance
is actually less for the mean of several lactations than it is for first lactations only. 

Prediction of future performance. The prediction of future performance is a problem
that occurs in many contexts. It has no genetical connotation but rests on the
partitioning of the variance into components due to permanent and temporary effects,
i.e., the partitioning made by the repeatability. Performances, both past and future,
must be thought of in terms of deviations from the population means, past and future.
A good past performance is partly due to the temporary environmental effects on
the individual and these are not carried through to the subsequent performance, so
the future performance tends to 'regress' toward the population mean. No prediction
can be made without a knowledge of the characteristics of the population with
respect to the two performances, for example milk-yield in first and second lactations.
The repeatability, which is the correlation between the two performances,
tells us how accurately we can predict the second from a knowledge of the first.
The prediction itself is made from the regression coefficient of second on first
performance. If x and y are first and second performances respectively, $\bar{x}$ and $\bar{y}$ are
the population means, and b is the regression coefficient of y on x, then the prediction
is given by $(y - \bar{y}) = b(x - \bar{x})$. The relationship between the regression and
correlation coefficients is $b = r\sigma_y/\sigma_x$, where $\sigma_x$ and $\sigma_y$ are the standard deviations.
Example 8.7 The prediction of future performance can be illustrated from the data
of Barker and Robertson (1966) on the milk-yield of British Friesian cows. The data
refer to 3,764 cows with records of yields in first, second, and third lactations. The means
and standard deviations are given in the table, yields being here converted to kilograms.
Both mean and standard deviation increased in successive lactations. Let us predict the
mean yield in second and third lactations of a heifer with a yield of 5,000 kg in her first
lactation. The repeatabilities required are the correlations of second with first and of
third with first; both were 0.40. The regression, for example of second on first, is then
calculated as $b = r\sigma_2/\sigma_1$. The predicted yield in the second lactation is obtained from
$y - \bar{y} = b(x - \bar{x})$ where y is the prediction, $\bar{y}$ is the mean yield in second lactations,
x is the observed yield of 5,000 kg and $\bar{x}$ is the mean yield in first lactations. The
calculations for second and third lactations and the predicted mean are set out in the table.



                                    Lactation
                                    1st        2nd        3rd

Mean, kg                            4,096      4,232      4,731
Standard deviation ($\sigma$)       696        934        960
Correlation with 1st (r)            -          0.40       0.40
Regression on 1st (b)               -          0.536      0.552

Observed yield in 1st lactation = 5,000
Deviation from mean             = +904
Predicted yield in 2nd          = 4,232 + (0.536 x 904) = 4,716.5
Predicted yield in 3rd          = 4,731 + (0.552 x 904) = 5,230.0
Predicted mean in 2nd and 3rd   = 4,973
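
The arithmetic of Example 8.7 can be checked with a short script. This is a sketch under the example's own figures; the variable and function names are illustrative. The regression coefficients are not rounded to three decimals here, so the predictions differ from the tabulated ones by up to about a kilogram.

```python
# Figures from the table of Example 8.7 (Barker and Robertson, 1966).
means = {1: 4096, 2: 4232, 3: 4731}   # mean yield in each lactation, kg
sds = {1: 696, 2: 934, 3: 960}        # standard deviations, kg
r = 0.40                              # correlation of 2nd, and of 3rd, with 1st
first_yield = 5000                    # observed first-lactation yield, kg


def predict(later):
    """Predicted yield in a later lactation from the regression on the first."""
    b = r * sds[later] / sds[1]                    # b = r * sigma_y / sigma_x
    return means[later] + b * (first_yield - means[1])


predictions = {lactation: predict(lactation) for lactation in (2, 3)}
mean_prediction = sum(predictions.values()) / len(predictions)

for lactation, value in predictions.items():
    print(f"Predicted yield in lactation {lactation}: {value:.1f} kg")
print(f"Predicted mean of 2nd and 3rd: {mean_prediction:.1f} kg")
```

Although the heifer was 904 kg above the mean in her first lactation, the predicted deviations in later lactations are only about half as large, which is the regression toward the mean described in the text.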



Summary of variance partitioning 

This chapter has shown how the phenotypic variance of a genetically variable population
can be partitioned into four components, two genetic and two environmental.
The data needed to do this are of three different kinds, each making a partition into
two parts, but in different ways. Table 8.5 summarizes the different partitions that
can be made.


Table 8.5 Summary of variance partitioning.

Data needed                      Partition made                       Ratio estimated

Resemblance between relatives    $(V_A) : (V_{NA} + V_{Eg} + V_{Es})$     heritability, $V_A/V_P$
Genetically uniform group        $(V_G) : (V_E)$                      degree of genetic
                                                                      determination, $V_G/V_P$
Multiple measurements            $(V_G + V_{Eg}) : V_{Es}$              repeatability,
                                                                      $(V_G + V_{Eg})/V_P$
All three                        $V_A : V_{NA} : V_{Eg} : V_{Es}$
Problems

8.1 The variances of leaf number in the $F_1$ and $F_2$ generations of a cross of tobacco
varieties were calculated in Problem 6.1. The variances were 1.46 in the $F_1$ and
5.97 in the $F_2$. Estimate the degree of genetic determination in the $F_2$ generation.
What assumptions have to be made to do this?

[Solution 7] 

8.2 Calculate the amounts of additive genetic and dominance variance arising from 
the genes referred to in the population specified in (1) Problems 7.2 and 7.5 (all 
three populations), (2) Problems 7.3 and 7.7, (3) Problems 7.4 and 7.8. 

[Solution 17] 

8.3 Work out the proportion of the total genetic variance that is due to dominance,
i.e. $V_D/V_G$, when the variance is caused by a single locus with the following degrees
of dominance. (1) $d = \tfrac{1}{2}a$, i.e. incomplete dominance, (2) $d = a$, i.e. complete
dominance, and (3) $d = 2a$, i.e. overdominance. The ratio $V_D/V_G$ depends on the
gene frequency. Plot graphs to show this relationship in each case and find from
the graphs the approximate maximum value of the ratio and the gene frequency at
which this occurs.

[Solution 27] 

8.4 Refer to Problem 7.9 and its solution. Work out the three components of genetic
variance, i.e. $V_A$, $V_D$ and $V_I$, attributable to the two genes in the population
specified. Express each component as a percentage of the total genetic variance, $V_G$.

[Solution 37] 

8.5 A sample of 10 female mice from a random-bred strain had the following 
numbers of live young born in their first and second litters. 


Mouse         1    2    3    4    5    6    7    8    9    10

1st litter    11   9    13   10   9    8    10   11   10   13
2nd litter    10   12   12   10   8    6    12   9    12   12


(1) From this sample calculate the repeatability of litter size. 

(2) What would you predict as being the expected size of the second litters of other 
mice from the same strain which had first litters of (a) 14 and (b) 5? 

[Solution 47] 

8.6 In a study of the fertility of pigs the litter sizes of 156 sows each of which 
had 10 litters were subjected to an analysis of variance with the following result. 
(The item ‘Litter order’ refers to differences between means of litters of the same 
order, first, second, third, etc.) 
Source          d.f.     Mean square

Between sows    155      25.56
Litter order    9        93.95
Within sows     1,395    3.23


(1) Calculate the repeatability of litter size. 

(2) Suppose that you are planning a breeding programme for which a high heritability 
(V A /V P ) is desirable. The fertility of individual sows can be measured by one 
litter, or with greater precision by the mean of several litters. By how much 
would the heritability be increased if fertility were measured as the mean of 
the first 2 litters, the first 3, and the first 4 litters? Assume that the repeatability 
is the same as that estimated from 10 litters. 

Data from Olbrycht, T.M. (1943) J. Agric. Sci., 33, 28-84. 


[Solution 57] 



RESEMBLANCE BETWEEN RELATIVES 


The resemblance between relatives is one of the basic genetic phenomena displayed
by metric characters, and the degree of resemblance is a property of the character
that can be determined by relatively simple measurements made on the population
without special experimental techniques. The degree of resemblance provides the
means of estimating the amount of additive genetic variance, and it is the proportionate
amount of additive variance (i.e., the heritability) that chiefly determines
the best breeding method to be used for improvement. An understanding of the causes
of resemblance between relatives is therefore fundamental to the practical study of
metric characters and to its application in animal and plant improvement. In this
chapter, therefore, we shall examine the causes of resemblance between relatives,
and show in principle how the amount of additive variance can be estimated from
the observed degree of resemblance, leaving the more practical aspects of the estimation
of the heritability for consideration in the next chapter.

In the last chapter we saw how the phenotypic variance can be partitioned into
components attributable to different causes. These components we shall call causal
components of variance, and denote them as before by the symbol V. The measurement
of the degree of resemblance between relatives rests on the partitioning of the
phenotypic variance in a different way, into components corresponding to the grouping
of the individuals into families. These components can be estimated directly from
the phenotypic values and for this reason we shall call them observational components
of phenotypic variance, and denote them by the symbol $\sigma^2$ in order to keep the
distinction clear. Consider, for example, the grouping of individuals into families
of full sibs. By the analysis of variance we can partition the total observed variance
into two components, between (or among) groups and within groups. The between-group
component is the variance of the 'true' means of the groups about the population
mean, and the within-group component is the variance of individuals about the
true mean of their group. The true mean of a group is the mean that would be found
if it were estimated without error from a very large number of individuals. An
explanation of the estimation of these two components will be given, with examples,
in the next chapter. Now, the resemblance between related individuals, i.e., between
full sibs in the case under discussion, can be looked at either as similarity of
individuals in the same group, or as difference between individuals in different groups.
The greater the similarity within the groups, the greater will be the difference
between the groups. The degree of resemblance can therefore be expressed as the
between-group component as a proportion of the total variance. This is the intraclass
correlation coefficient and is given by

$$t = \frac{\sigma^2_B}{\sigma^2_B + \sigma^2_W}$$

where $\sigma^2_B$ is the between-group component and $\sigma^2_W$ the within-group component.
(It is customary to use the symbol t for the intraclass correlation of phenotypic values
in order to avoid confusion with other correlations for which the symbol r is used.)
The between-group component expresses the amount of variation that is common
to members of the same group and it can equally well be referred to as the covariance
of members of the groups. In the case of the resemblance between offspring and
parents, the grouping of the observations is into pairs rather than groups; one parent,
or the mean of two parents, paired with one offspring or the mean of several offspring.
The between-pair component of variance is then meaningful only if the parental
values and the offspring values have the same variance, which they often do not.
The covariance of offspring with parents is therefore calculated from the sum of
cross-products, and the degree of resemblance is expressed as the regression of offspring
on parents. The reason why the correlation is often inappropriate will become
apparent later. The regression is given by

$$b_{OP} = \frac{\text{cov}_{OP}}{\sigma^2_P}$$

where $\text{cov}_{OP}$ is the covariance of offspring and parents, and $\sigma^2_P$ is the variance of
parents.

Thus, the covariance of related individuals is the new property of the population
that we have to deduce in seeking the cause of resemblance between relatives, whether
sibs or offspring with parents. The covariance, being simply a portion of the total
phenotypic variance, is composed of the causal components described in the last
chapter, but in amounts and proportions differing according to the sort of relationship.
By finding out how the causal components contribute to the covariance, we
shall see how an observed covariance can be used to estimate the causal components
of which it is composed.

The commonest and most useful relationships are offspring with parents, half sibs, 
full sibs, and (in human studies) twins. The covariances in these relationships will 
be explained fully, and those of other relationships summarized afterwards. Twins 
have their special problems which will be dealt with in the next chapter. 

Both genetic and environmental sources of variance contribute to the covariance 
of relatives, the covariance of phenotypic values being the sum of the genetic and 
environmental covariances. The genetic covariances will be described first, with 
the regressions or correlations that they give rise to, and the environmental causes 
of resemblance will be commented on later. 


Genetic covariance 

Our object now is to deduce from theoretical considerations the covariance of relatives
arising from genetic causes, neglecting for the time being any non-genetic causes
of resemblance that there may be. This means that we have to deduce the covariance
of the genotypic values of the related individuals. The population will be assumed
to be in Hardy-Weinberg equilibrium, all parents mating at random with respect
to the character under consideration. The effects of assortative mating will be considered
in the next chapter. Any variance arising from epistatic interaction between
loci will at first be neglected, its effects being described briefly later. For each
relationship, two ways of deducing the covariance will be described, the first being more
concise and the second more explicit.

Offspring and one parent 

The covariance to be deduced is that of the genotypic values of individuals with
the mean genotypic values of their offspring produced by mating at random in the
population. If values are expressed as deviations from the population mean, then
the mean value of the offspring is by definition half the breeding value of the parent,
as explained in Chapter 7. Therefore the covariance to be computed is that of an
individual's genotypic value with half its breeding value, i.e., the covariance of G
with $\tfrac{1}{2}A$. Since G = A + D (D being the dominance deviation), the covariance is
that of (A + D) with $\tfrac{1}{2}A$. Multiplying these terms together and summing over all
parents gives

$$\text{Sum of cross-products} = \sum \tfrac{1}{2}A(A + D) = \tfrac{1}{2}\sum A^2 + \tfrac{1}{2}\sum AD$$

Dividing both sides by the number of parents gives the covariance as
$\tfrac{1}{2}V_A + \tfrac{1}{2}\text{cov}_{AD}$. It was shown in the previous chapter that $\text{cov}_{AD}$ is zero, so

$$\text{cov}_{OP} = \tfrac{1}{2}V_A \qquad [9.1]$$

The genetic covariance of offspring and one parent is therefore half the additive
genetic variance of the parents.

The second, more explicit, way of deriving the covariance is by consideration 
of the effects of single loci. This will be done by reference to a locus with two alleles 
but the conclusions are equally valid for loci with any number of alleles. Table 9.1 
gives the genotypes of the parents, their frequencies in the population, and their 
genotypic values expressed as deviations from the population mean (from Table 7.3). 
The right-hand column gives the mean genotypic values of the offspring, which are 
half the breeding values of the parents as given in Table 7.3. The covariance of 
offspring and parent is then the mean cross-product, and is obtained by multiplying
together the three columns (frequency × genotypic value of parent × genotypic
value of offspring) and summing over the three genotypes of the parents. After
collecting together the terms in $\alpha^2$ and the terms in $\alpha d$ we obtain


Table 9.1

            Parents                                       Offspring
Genotype    Frequency    Genotypic value                  Mean genotypic value

$A_1A_1$    $p^2$        $2q(\alpha - qd)$                $q\alpha$
$A_1A_2$    $2pq$        $(q - p)\alpha + 2pqd$           $\tfrac{1}{2}(q - p)\alpha$
$A_2A_2$    $q^2$        $-2p(\alpha + pd)$               $-p\alpha$
$$\text{cov}_{OP} = pq\alpha^2(p^2 + 2pq + q^2) + 2p^2q^2\alpha d(-q + q - p + p)$$
$$= pq\alpha^2$$
$$= \tfrac{1}{2}V_A$$

since, from equation [8.3a], $V_A = 2pq\alpha^2$. Summing over all loci we again reach
the conclusion that the covariance of offspring and one parent is equal to half the
additive variance.
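
The single-locus result can be checked numerically from the rows of Table 9.1. The following is a minimal sketch, not part of the original derivation: the function name and the trial values of p, a and d are assumptions, and $\alpha = a + d(q - p)$ is the average effect of the gene substitution as defined in Chapter 7.

```python
def offspring_parent_covariance(p, a, d):
    """Covariance of offspring means with one parent at a two-allele locus.

    p is the frequency of A1; a and d are the genotypic values of the
    homozygote and heterozygote. Returns (cov_OP, V_A) from Table 9.1.
    """
    q = 1 - p
    alpha = a + d * (q - p)  # average effect of the gene substitution
    rows = [
        # (frequency, genotypic value of parent, mean genotypic value of offspring)
        (p * p, 2 * q * (alpha - q * d), q * alpha),                          # A1A1
        (2 * p * q, (q - p) * alpha + 2 * p * q * d, 0.5 * (q - p) * alpha),  # A1A2
        (q * q, -2 * p * (alpha + p * d), -p * alpha),                        # A2A2
    ]
    # Values are deviations from the population mean, so the mean
    # cross-product is the covariance.
    cov_op = sum(freq * parent * offspring for freq, parent, offspring in rows)
    v_a = 2 * p * q * alpha ** 2
    return cov_op, v_a


# Arbitrary trial values, chosen only for illustration.
cov_op, v_a = offspring_parent_covariance(p=0.3, a=1.0, d=0.5)
print(cov_op, 0.5 * v_a)
```

The two printed numbers agree for any choice of p, a and d, because the terms in $\alpha d$ cancel exactly as in the algebra above.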

The regression of offspring on one parent is got by dividing the covariance by
the variance of the parents, which is the phenotypic variance of the population. Thus
the regression is

$$b_{OP} = \tfrac{1}{2}\frac{V_A}{V_P} \qquad [9.2]$$

The covariance was deduced above by considering the mean value of the offspring
of each parent, without specifying the number of offspring on which the mean is
based. In fact, the covariance is the same whatever the number of offspring, even
if only one is used, for the following reason. The mean of n offspring is $(1/n)\sum O$,
where $\sum O$ is the sum of the values of the offspring. The covariance of one parent
with the mean of n offspring is $\text{cov}(P, (1/n)\sum O) = (1/n)\sum \text{cov}(P, O) = \text{cov}(P, O)$,
which is the covariance of parents with any one offspring. This conclusion is applicable
to relatives of any kind. In general, therefore, the covariance of any individual
with the mean value of a number of relatives is equal to its covariance with any
one of those relatives. The regression of offspring on parents is also unaffected by
the number of offspring used, because the variance of offspring does not enter into
the calculation of the regression.

Offspring and mid-parent 

The covariance of the mean of the offspring and the mean of both parents (commonly
called the 'mid-parent') may be deduced in the following way. Let O be the
mean of the offspring, and P and P' be the values of the two parents. The mid-parent
value is $\bar{P} = \tfrac{1}{2}(P + P')$. The sum of cross-products is
$\sum O\bar{P} = \tfrac{1}{2}(\sum OP + \sum OP')$, and the covariance is
$\text{cov}_{O\bar{P}} = \tfrac{1}{2}(\text{cov}_{OP} + \text{cov}_{OP'})$. If P and P' have the
same variance, then $\text{cov}_{OP} = \text{cov}_{OP'}$ and consequently

$$\text{cov}_{O\bar{P}} = \text{cov}_{OP} = \tfrac{1}{2}V_A \qquad [9.3]$$

Thus, provided the two sexes have equal variances, the covariance of offspring and
mid-parent is the same as that of offspring with one parent, which we have seen
is equal to half the additive variance.

The longer method of demonstrating the covariance of offspring with mid-parent
is rather laborious, but it must be given since it will be needed for arriving at the
covariance of full sibs. We shall, however, omit some of the steps of algebraic reduction.
A table (Table 9.2) is made in the same manner as for offspring and one parent,
but now we have to tabulate types of mating and their frequencies, instead of single
parents. Against each type of mating we put the mean genotypic value of the two
parents, i.e., the mid-parent value; then the proportions of the three genotypes among
the progeny, and the mean genotypic value of the progeny. The working is made
easier by writing the genotypic values in terms of a and d instead of as deviations
from the population mean. The last two columns of the table give the product of
progeny-mean × mid-parent, and the square of the progeny mean for later use. To get
the covariance of progeny-mean and mid-parent value, we take the product of
progeny-mean × mid-parent and multiply it by the frequency of the mating type,
and then sum over mating types. This gives the mean product (MP) from which
we have to deduct a correction for the population mean, since values are not here
expressed as deviations from the mean. The correction is simply the square of the
population mean ($M^2$) since the means of parents and of progeny are equal. Both
MP and $M^2$ contain terms in $a^2$, in $ad$, and in $d^2$. By collecting together these terms
and simplifying a little we obtain

$$MP = a^2[p^3(p + q) + q^3(p + q)] + 2adpq(p^2 - q^2) + d^2pq(p^2 + 2pq + q^2)$$

$$M^2 = a^2(p^2 - 2pq + q^2) + 4adpq(p - q) + 4d^2p^2q^2$$

Then,

$$\text{cov}_{O\bar{P}} = MP - M^2$$
$$= a^2pq - 2adpq(p - q) + d^2pq(p - q)^2$$
$$= pq[a + d(q - p)]^2$$
$$= pq\alpha^2$$
$$= \tfrac{1}{2}V_A$$

when summed over all loci.

Though the covariance of offspring with the mean of both parents is the same
as the covariance with a single parent, the degree of resemblance is not the same.
The regression of offspring on mid-parent values is $b = \text{cov}_{O\bar{P}}/\sigma^2_{\bar{P}}$, where $\sigma^2_{\bar{P}}$ is the
variance of mid-parent values. If the variances of the two sexes are equal, then
$\sigma^2_{\bar{P}} = \tfrac{1}{2}V_P$ because, in general, the variance of the mean of n individuals is one nth of
the variance of single individuals. The regression of offspring on mid-parent values
is therefore

$$b_{O\bar{P}} = \frac{\tfrac{1}{2}V_A}{\tfrac{1}{2}V_P} = \frac{V_A}{V_P} \qquad [9.4]$$

which is twice the regression on single parents. As with single parents, the number
of offspring used does not affect the covariance or the regression.

The regression of offspring on parents is a useful measure of the degree of
resemblance because it is simply related to the causal components of variance. The
correlation between offspring and parents, however, does not have this useful feature.
The correlation is calculated as $\text{cov}_{OP}/\sigma_O\sigma_P$, where $\sigma_O$ and $\sigma_P$ are the square roots
of the variances of offspring and parents respectively, whether single or the mean
of more than one. So the correlation is affected by the number of offspring as well
as by the number of parents. If there is only one offspring, the correlation with a
single parent is the same as the regression, but under all other circumstances it is
different. The correlation of one offspring with mid-parent values is $\sqrt{\tfrac{1}{2}}\,V_A/V_P$.
When there are more than one offspring, the correlation depends on the variance
of the observed means of the offspring and has no simple relationship with the causal
components of variance.
Half sibs

Half sibs are individuals that have one parent in common and the other parent different.
A group of half sibs is therefore the progeny of one individual mated to a
random group of the other sex and having one offspring by each mate. Thus the
mean genotypic value of the group of half sibs is by definition half the breeding
value of the common parent. The covariance is the variance of the true means of
the half-sib groups, and is therefore the variance of half the breeding values of the
common parents, which is a quarter of the additive variance:

$$\text{cov}_{(HS)} = V_{\frac{1}{2}A} = \tfrac{1}{4}V_A \qquad [9.5]$$

This covariance also can be demonstrated by the longer method, from the values
already given in Table 9.1. The covariance is the variance of the means of the groups
of offspring listed in the right-hand column. Squaring the offspring values and
multiplying by their frequencies gives:

Variance of means of half-sib families

$$= p^2q^2\alpha^2 + 2pq \cdot \tfrac{1}{4}(q - p)^2\alpha^2 + q^2p^2\alpha^2$$
$$= pq\alpha^2[pq + \tfrac{1}{2}(q - p)^2 + pq]$$
$$= pq\alpha^2[\tfrac{1}{2}(p + q)^2]$$
$$= \tfrac{1}{2}pq\alpha^2$$

Therefore, since $2pq\alpha^2 = V_A$ (from equation [8.3d]),

$$\text{cov}_{(HS)} = \tfrac{1}{4}V_A$$

summation being made over all loci.
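
The same kind of numerical check applies to half sibs, using only the frequencies and offspring means of Table 9.1. This is an illustrative sketch; the function name and trial values are assumptions.

```python
def half_sib_covariance(p, a, d):
    """Variance of the true means of half-sib families at a two-allele locus.

    Each family mean is half the breeding value of the common parent,
    taken from the right-hand column of Table 9.1.
    """
    q = 1 - p
    alpha = a + d * (q - p)  # average effect of the gene substitution
    family_means = [
        (p * p, q * alpha),                    # common parent A1A1
        (2 * p * q, 0.5 * (q - p) * alpha),    # common parent A1A2
        (q * q, -p * alpha),                   # common parent A2A2
    ]
    mean = sum(freq * m for freq, m in family_means)  # zero for deviations
    cov_hs = sum(freq * m * m for freq, m in family_means) - mean ** 2
    v_a = 2 * p * q * alpha ** 2
    return cov_hs, v_a


# Arbitrary trial values, chosen only for illustration.
cov_hs, v_a = half_sib_covariance(p=0.3, a=1.0, d=0.5)
print(cov_hs, 0.25 * v_a)
```

Dominance enters only through $\alpha$; no separate term in $d^2$ appears, in keeping with the absence of dominance variance from the half-sib covariance.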

The degree of resemblance between sibs is expressed as the intraclass correlation,
which is the between-group variance, i.e., the covariance, as a proportion of
the total variance. So the correlation of half sibs is

$$t = \tfrac{1}{4}\frac{V_A}{V_P} \qquad [9.6]$$

Full sibs 

The covariance of full sibs is less simple than those of the relationships so far considered
because the dominance variance contributes to it. Consider first the covariance
due to the additive variance alone. Full sibs have both parents in common and the
mean genotypic value of a group of full sibs is then equal to the mean breeding value
of the two parents. Let A and A' be the breeding values of the two parents. Then
the covariance is the variance of $\tfrac{1}{2}(A + A')$ which is $\tfrac{1}{4}(V_A + V_{A'}) = \tfrac{1}{2}V_A$ if the
additive variance is the same in the two sexes. Now consider the contribution of
dominance. It is easier to think of the covariance being calculated from the sum
of cross-products in pairs of sibs taken at random. Let the parents have genotypes
$A_1A_2$ and $A_3A_4$. There are then four genotypes among the progeny, $A_1A_3$, $A_1A_4$,
$A_2A_3$, and $A_2A_4$, each with a frequency of $\tfrac{1}{4}$. Let the first sib chosen have any one
of these genotypes. Then the probability that the second sib has the same genotype
is $\tfrac{1}{4}$. Thus, one-quarter of all sib-pairs have the same genotype and consequently
the same dominance deviation, D. For these pairs having the same dominance deviation,
the cross-product of the dominance deviations is $D^2$; other pairs, with different
dominance deviations, have a mean cross-product of zero. Over all pairs, therefore,
the sum of cross-products is $\tfrac{1}{4}\sum D^2$, and so the mean cross-product is $\tfrac{1}{4}V_D$. This is
the covariance due to dominance deviations, and it adds to the covariance due to
breeding values. The genetic covariance of full sibs is therefore

$$\text{cov}_{(FS)} = \tfrac{1}{2}V_A + \tfrac{1}{4}V_D \qquad [9.7]$$

The second way of deriving the covariance of full sibs comes from Table 9.2 with
little additional work. The covariance is the variance of the means of families. The
right-hand column shows the squares of the progeny means, and it will be seen that
these are all exactly the same as the products of progeny mean $\times$ mid-parent, except
for the two entries in the middle involving terms in $d^2$. The mean square (MS)
can therefore be got from the mean product (MP) already calculated; thus

$MS = MP + d^2 \cdot 2p^2q^2 - \tfrac{1}{4}d^2 \cdot 4p^2q^2$
$\phantom{MS} = MP + d^2p^2q^2$

The correction for the mean is the same as before, so we have

$\mathrm{cov}_{(FS)} = \mathrm{cov}_{O\bar{P}} + d^2p^2q^2$
$\phantom{\mathrm{cov}_{(FS)}} = pq\alpha^2 + d^2p^2q^2$

Since $2pq\alpha^2 = V_A$ (from equation [8.3a]) and $4d^2p^2q^2 = V_D$ (from equation
[8.4]) the covariance of full sibs is

$\mathrm{cov}_{(FS)} = \tfrac{1}{2}V_A + \tfrac{1}{4}V_D$

summing over all loci.

The correlation of full sibs is

$t = \frac{\tfrac{1}{2}V_A + \tfrac{1}{4}V_D}{V_P}$    ... [9.8]


In principle the difference between the covariances of full sibs and of half sibs
provides a way of estimating the dominance variance, since
$\mathrm{cov}_{(FS)} - 2\,\mathrm{cov}_{(HS)} = \tfrac{1}{4}V_D$. In practice, however, this can be done only if there are no environmental
contributions to the phenotypic covariances.
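
The full-sib result can be verified by brute force for one locus. The sketch below enumerates all matings under random mating, computes the true mean of each full-sib family, and takes the variance of those means. It then recovers the dominance variance from the full-sib and half-sib covariances. Function names and numerical values are illustrative assumptions; the genotypic values +a, d, -a and the expressions V_A = 2pq alpha^2 and V_D = (2pqd)^2 are the single-locus definitions assumed from Chapters 7 and 8.

```python
from itertools import product


def variance_components(p, a, d):
    """Return (V_A, V_D) for one locus under random mating."""
    q = 1.0 - p
    alpha = a + d * (q - p)
    return 2.0 * p * q * alpha ** 2, (2.0 * p * q * d) ** 2


def full_sib_covariance(p, a, d):
    """Variance of the true means of full-sib families, by enumeration."""
    q = 1.0 - p
    # genotype: (frequency, probability of transmitting allele A1)
    parents = [(p * p, 1.0), (2.0 * p * q, 0.5), (q * q, 0.0)]
    families = []
    for (f1, t1), (f2, t2) in product(parents, repeat=2):
        family_mean = (
            t1 * t2 * a                                  # A1A1 progeny
            + (t1 * (1.0 - t2) + (1.0 - t1) * t2) * d    # A1A2 progeny
            - (1.0 - t1) * (1.0 - t2) * a                # A2A2 progeny
        )
        families.append((f1 * f2, family_mean))
    grand_mean = sum(f * m for f, m in families)
    return sum(f * (m - grand_mean) ** 2 for f, m in families)


v_a, v_d = variance_components(p=0.3, a=1.0, d=0.5)
cov_fs = full_sib_covariance(p=0.3, a=1.0, d=0.5)
cov_hs = 0.25 * v_a                       # equation [9.5]
v_d_estimate = 4.0 * (cov_fs - 2.0 * cov_hs)
print(cov_fs, 0.5 * v_a + 0.25 * v_d, v_d_estimate, v_d)
```

This check covers genetic covariances only. With real data the same subtraction is valid only when there is no environmental contribution to either covariance, as the text notes.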


Twins 

The genetic covariances of twins are very simple. Dizygotic (fraternal) twins are
related as full sibs and their genetic covariance is that of full sibs. Monozygotic (identical)
twins have identical genotypes, so there is no genetic variance within pairs
and the whole of the genetic variance appears in the between-pair component. The
genetic covariance is therefore

$\mathrm{cov}_{(MZ)} = V_G$    ... [9.9]


General 

From the relationships explained above, it will have been seen that the covariance
is made up of simple fractions of the causal components of variance, $V_A$ and $V_D$,
the fractions in the cases dealt with being $\tfrac{1}{2}$ or $\tfrac{1}{4}$. These fractions, or coefficients,
are related in a simple manner to the coancestries of the relatives and their parents,
so that the covariance of any sort of relatives can be easily deduced from a consideration
of the appropriate pedigree. Let $r$ be the fraction of the additive genetic
variance, and $u$ that of the dominance variance, appearing in the covariance. Then
the generalized covariance for any sort of relationship is

$\mathrm{cov} = rV_A + uV_D$    ... [9.10]

Let P and Q be two individuals representing the relationship whose covariance is
required, and let A, B and C, D be their parents respectively, as shown in Fig. 5.4.
Then, letting $f$ stand for the coancestry as explained in Chapter 5, the values of $r$
and $u$ are obtained as

$r = 2f_{PQ}$    ... [9.11]

$u = f_{AC}f_{BD} + f_{AD}f_{BC}$    ... [9.12]

(see Crow and Kimura, 1970, p. 134). Table 9.3 gives the values of $r$ and $u$, summarizing
the relationships already described and adding some other, more distant,
relationships. Equations [9.11] and [9.12] apply to a random-breeding population.
Inbreeding of the parents, however, does not affect them, but if the relatives
themselves are inbred then

$r = \frac{2f_{PQ}}{\sqrt{(1 + F_P)(1 + F_Q)}}$

(Crow and Kimura, 1970, p. 138).

The coefficient r of the additive variance is sometimes called the coefficient of 
relationship, or the theoretical correlation, between the relatives in question. It is 
the correlation between their breeding values, and it represents the correlation that 
would be found if all the phenotypic variance were additive genetic. We shall return 
to this point when considering the estimation of the heritability in the next chapter. 
The coefficient u of the dominance variance represents the probability of the relatives 
having the same genotype through identity by descent. It is zero unless the related 
individuals have paths of coancestry through both of their respective parents, as have 
full sibs and double first cousins. 
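
Equations [9.11] and [9.12] translate directly into a small function. The sketch below is illustrative only. It assumes non-inbred relatives P and Q that are collateral (neither is an ancestor of the other), so that their coancestry can be taken as the mean of the four coancestries between their parents, a rule assumed here from Chapter 5. The function name and argument names are hypothetical.

```python
def covariance_coefficients(f_ac, f_ad, f_bc, f_bd):
    """Return (r, u) for relatives P (parents A, B) and Q (parents C, D).

    Arguments are the coancestries between the parents of P and the
    parents of Q. Assumes P and Q are collateral relatives in a
    random-breeding population.
    """
    f_pq = 0.25 * (f_ac + f_ad + f_bc + f_bd)   # coancestry of P with Q
    r = 2.0 * f_pq                              # equation [9.11]
    u = f_ac * f_bd + f_ad * f_bc               # equation [9.12]
    return r, u


# Full sibs: A is C and B is D; a non-inbred individual has
# coancestry 1/2 with itself, and the two parents are unrelated.
full_sibs = covariance_coefficients(0.5, 0.0, 0.0, 0.5)

# Half sibs: only one parent (A is C) is shared.
half_sibs = covariance_coefficients(0.5, 0.0, 0.0, 0.0)

# Double first cousins: A and C are full sibs, B and D are full sibs,
# so each pair has coancestry 1/4.
double_first_cousins = covariance_coefficients(0.25, 0.0, 0.0, 0.25)

print(full_sibs, half_sibs, double_first_cousins)
```

The three calls reproduce the coefficients derived in the text for full sibs and half sibs, and the double first cousin value shows how a non-zero u arises only when there are paths of coancestry through both parents of each relative.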


Table 9.3 Coefficients of the variance components in the covariances of relatives.

                                              Coefficient
Relationship                             r (of V_A)    u (of V_D)

MZ twins                                     1             1
First-degree
  Offspring : parent                        1/2            0
  Full sib                                  1/2           1/4
Second-degree
  Half sib                                  1/4            0
  Offspring : grandparent                   1/4            0
  Uncle (aunt) : nephew (niece)             1/4            0
  Double first cousin                       1/4           1/16
Third-degree
  Offspring : great-grandparent             1/8            0
  Single first cousin                       1/8            0
Epistatic interaction

The variance arising from epistatic interaction between loci contributes small fractions
to the covariances of relatives. In Chapter 8 it was noted that the interaction
variance $V_I$ is subdivided into components according to the number of interacting
loci, and according to whether the interaction is between breeding values or
dominance deviations. The generalized covariance of any sort of relatives is as follows
when the interaction components are included (for details, see Kempthorne,
1955a, b):

$\mathrm{cov} = rV_A + uV_D + r^2V_{AA} + ruV_{AD} + u^2V_{DD}$
$\qquad + r^3V_{AAA} + r^2uV_{AAD} + ru^2V_{ADD} + u^3V_{DDD} + \text{etc.}$    ... [9.13]

Table 9.4 gives the coefficients of the variance components in the covariances with
two-factor interactions included. The offspring-parent covariance refers equally to
one parent and to mid-parent values. The conclusions that come from consideration
of the interaction components are the following. First, only small fractions of the
interaction components contribute to any covariance, the most being $\tfrac{1}{4}V_{AA}$; the
contributions of interactions between more than two loci are even smaller. Second,
interaction components involving dominance variance do not contribute unless the
dominance variance itself contributes. Third, and this is the most important point,
the interactions of breeding values, $V_{AA}$, $V_{AAA}$, etc., contribute to all covariances
of relatives. Any estimate of $V_A$ made from half-sib correlations will contain also
$\tfrac{1}{4}V_{AA} + \tfrac{1}{16}V_{AAA}$, etc.; and any estimate of $V_D$ obtained from a full-sib correlation
will contain also portions of the A $\times$ D and D $\times$ D interaction components. It was
noted in Chapter 7 that the two definitions of breeding value given there are not
equivalent if there is interaction between loci. We can now see how this comes about.
Defined in terms of the measured values of progeny (the practical definition),
breeding value includes additive $\times$ additive interaction deviations as well as the
average effects of the genes carried by the parents; whereas, defined in terms of
the average effects of genes (the theoretical definition), it does not.

From the coefficients in Table 9.4 one can see how in principle the interaction
components can be estimated. For example,
$\mathrm{cov}_{OP} - 2\,\mathrm{cov}_{(HS)} = \tfrac{1}{8}V_{AA}$. To
estimate the interaction components, however, requires complex experiments of great
precision, providing comparisons of the covariances of many different sorts of
relatives free of environmental covariance. This is hardly practicable with animals
but it can be done with plants: see, for example, Pooni et al. (1978), and Chi et
al. (1969), who found negligible interaction variances in seven characters of maize.


Table 9.4 Covariances of relatives including the contributions of two-factor interactions.

                                  Variance components and the coefficients
                                  of their contributions
Relatives                          V_A     V_D     V_AA    V_AD    V_DD

Offspring-parent:  cov_OP    =     1/2      -      1/4      -       -
Half sibs:         cov_(HS)  =     1/4      -      1/16     -       -
Full sibs:         cov_(FS)  =     1/2     1/4     1/4     1/8     1/16
General:           cov       =      r       u      r^2     ru      u^2
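
The general row of Table 9.4 can be written as a short function, which makes it easy to see how much interaction variance leaks into an estimate. This is a sketch restricted to two-factor interactions, following equation [9.13]. The variance components used in the example are invented round numbers chosen only for illustration, and all names are hypothetical.

```python
def covariance_with_interactions(r, u, v):
    """Genetic covariance of relatives, two-factor interactions included.

    r, u : coefficients of V_A and V_D for the relationship
    v    : dict with keys 'A', 'D', 'AA', 'AD', 'DD' (variance components)
    """
    return (
        r * v["A"]
        + u * v["D"]
        + r * r * v["AA"]
        + r * u * v["AD"]
        + u * u * v["DD"]
    )


# Invented components, for illustration only.
components = {"A": 40.0, "D": 10.0, "AA": 8.0, "AD": 4.0, "DD": 2.0}

cov_op = covariance_with_interactions(0.5, 0.0, components)   # offspring-parent
cov_hs = covariance_with_interactions(0.25, 0.0, components)  # half sibs

# An "estimate of V_A" from half sibs is 4 * cov(HS); it also contains V_AA / 4.
va_from_half_sibs = 4.0 * cov_hs
# The difference cov_OP - 2 cov(HS) isolates V_AA / 8.
vaa_eighth = cov_op - 2.0 * cov_hs

print(va_from_half_sibs, vaa_eighth)
```

With these invented numbers the half-sib estimate of the additive variance exceeds the true value of 40 by one quarter of V_AA, which is the bias described in the text.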

Effects of linkage. In the derivations of all the covariances given above, the effects
of linkage have been ignored, the summation over loci carrying the implicit assumption
that all the loci segregate independently. Linkage does not affect the covariances
if the population is in linkage equilibrium, unless there is epistatic interaction between
the loci; the full- and half-sib covariances are then increased by linkage between
the interacting loci. For details see Cockerham (1956) and Weir, Cockerham,
and Reynolds (1980).

Environmental covariance 

Genetic causes are not the only reasons for resemblance between relatives; there
are also environmental circumstances that tend to make relatives resemble each other,
some sorts of relatives more than others. If members of a family are reared together,
as with human families or litters of pigs or mice, they share a common environment.
This means that some environmental circumstances that cause differences between
unrelated individuals are not a cause of difference between members of the
same family. In other words, there is a component of environmental variance that
contributes to the variance between means of families but not to the variance within
the families, and it therefore contributes to the covariance of the related individuals.
This between-group environmental component, for which we shall use the symbol
$V_{Ec}$, is usually called the common environment, a term that seems more appropriate
when we think of the component as a cause of similarity between members of a
group than when we think of it as a cause of difference between members of different
groups. The remainder of the environmental variance, which we shall denote
by $V_{Ew}$, arises from causes of difference that are unconnected with whether the individuals
are related or not. It therefore appears in the within-group component of
variance, but does not contribute to the between-group component, which is the
variance of the true means of the groups. In considerations of the resemblance between
relatives, therefore, the environmental variance must be divided into two
components:

$V_E = V_{Ec} + V_{Ew}$    ... [9.14]

one of the components, $V_{Ec}$, contributing to the covariance of the related
individuals.

The sources of the common environmental variance are many and varied, and
arise from environmental factors such as nutrition, climatic conditions or, in man,
cultural influences. Whenever families differ in respect of these factors there will
be, or may be, environmentally caused differences between the means of families,
which appear in the covariance as the $V_{Ec}$ component. What we designate as the
$V_{Ec}$ component depends on the way in which individuals are grouped when we
estimate the observational components of phenotypic variance. Whatever the form
of the analysis, the part of the variance between the means of groups that can be
ascribed to environmental causes is called the $V_{Ec}$ component. The nature of this
component thus depends on the form of the analysis applied. If the groups in the
analysis are full-sib families then the $V_{Ec}$ component represents environmental
causes of similarity between full sibs; if the groups are half sibs it represents causes
of similarity between half sibs. And in parent-offspring relationships a comparable
covariance term represents environmental causes of resemblance between offspring
and parent. Thus, whenever we measure a phenotypic covariance with the object
of using it to estimate a causal component of variance, we have to decide whether
it includes an appreciable component due to common environment, and this is often
a matter of judgement based on a biological understanding of the organism and the
character. In experiments, much of the $V_{Ec}$ component can often be eliminated by
suitable design. For example, members of the same family need not always be reared
in the same vial, cage, or plot: they can be randomized over the rearing environments.
Or, by dividing families into two or more groups, the $V_{Ec}$ component can be
measured and deducted from the covariance.

Maternal effects are a frequent, and often troublesome, source of environmental
resemblance, particularly with mammals. The young are subject to a maternal
environment during the first stages of their life, and this influences the phenotypic
values of many metric characters even when measured on the adult, causing offspring
of the same mother to resemble each other. Two sorts of maternal effect need
to be distinguished. First, the phenotypic value of the mother for the character in
question may influence the value of the offspring for the same character. For example,
large mice give more milk than small mice and consequently their young grow
better. This leads to an environmentally caused resemblance between the weight of
the offspring and the weight of their mother. Furthermore, offspring of the same
mother resemble each other in weight because they have shared the same milk supply.
This sort of maternal effect therefore contributes an environmental component
to the covariance of offspring with mothers, and to the covariance of full sibs or
maternal half sibs. The second sort of maternal effect causes resemblance between
offspring of the same mother, but not between the offspring and their mother. This
arises when the character in the mother that gives rise to the maternal effect is not
the character whose covariance is being studied. For example, the growth of the
tails of young mice is influenced by the temperature in the nest. Mothers differ in
the assiduity with which they nurse their young, and consequently there are differences
in nest temperature between families. This produces an environmental component
in the covariance of sibs in respect of tail length. But the nest temperature is not
related to the mother's tail length, so there is no environmental covariance of offspring
and mothers in respect of tail length. The variation among offspring due to
a maternal effect results from variation among the mothers in the character that gives
rise to the maternal effect, such as milk-yield. The maternal character is, to a greater
or lesser degree, determined by the mother's genotype. Therefore the environmental
variance $V_{Ec}$ seen in the offspring is to some extent the consequence of genetic
variation of some other character in the mothers. The resemblance between relatives
becomes very complicated when the genetic basis of a maternal effect is taken into
account. For details, see Willham (1963), Thompson (1976).

Relatives of all sorts may be subject to an environmental source of resemblance.
In what follows, however, we shall make the simplification of disregarding the $V_{Ec}$
component for all relatives except full sibs. The common maternal environment of
full sibs is often the most troublesome source of environmental resemblance to overcome
by experimental design. Consequently, a $V_{Ec}$ component contributes more
often and in greater amount to the covariance of full sibs than to that of any other
sort of relative.

Competition. Brief mention must be made of a way by which resemblance between
relatives can be reduced, instead of increased, for environmental reasons. This occurs
when members of the same family compete for limited resources. Suppose,
for example, that sib-families of animals are reared with each family in a separate
pen, and that all pens are given the same fixed amount of food, growth rate being
the character of interest. There would then be little or no variation in growth rate
between families, and the covariance would consequently be reduced. There would,
however, still be variation within families; indeed, there might be more variation
than with unrestricted feeding because the competition is an additional source of
variation. The intraclass correlation would therefore be reduced by the competition
for fixed resources. The correlation could even be negative, because if one individual
gets more to eat, another must of necessity get less. Competition is an important
factor in plants, often making sib-correlations largely meaningless, particularly with
characters related to yield.

Phenotypic resemblance 

The covariance of phenotypic values is the sum of the covariances arising from genetic
and from environmental causes. Thus by putting together the conclusions of the two
preceding sections we arrive at the phenotypic covariances summarized in Table
9.5. (It will be remembered that some possible sources of environmental covariance
are being neglected, particularly in offspring-parent relationships involving the
mother.) In all these relationships except that of full sibs, the covariance is either
a half or a quarter of the additive genetic variance. By observing the phenotypic
covariance of relatives, we can thus estimate the amount of additive genetic variance.


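Table 9.5 Phenotypic resemblance between relatives

Relatives                    Covariance                                            Regression (b) or correlation (t)

Offspring and one parent     $\tfrac{1}{2}V_A$                                     $b = \tfrac{1}{2}\,V_A/V_P$
Offspring and mid-parent     $\tfrac{1}{2}V_A$                                     $b = V_A/V_P$
Half sibs                    $\tfrac{1}{4}V_A$                                     $t = \tfrac{1}{4}\,V_A/V_P$
Full sibs                    $\tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec}$          $t = (\tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec})/V_P$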



Similarly, the regression or correlation provides a means of estimating the proportionate
amount of additive genetic variance, $V_A/V_P$, which is the heritability, and
this is the chief use of measurements of the degree of resemblance between relatives.
The method of estimating the heritability will be considered more fully in the next
chapter.
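
The entries of Table 9.5 can be computed from a set of variance components, as in the sketch below. The function name and the numerical values are illustrative assumptions. The mid-parent regression divides by the variance of mid-parent values, taken here as half the phenotypic variance, which assumes equal variances in the two sexes and uncorrelated parents.

```python
def phenotypic_resemblance(v_a, v_d, v_ec, v_p):
    """Regressions (b) and correlations (t) between relatives, as in Table 9.5.

    v_ec is the common-environment variance, counted for full sibs only.
    """
    cov_op = 0.5 * v_a                          # one parent or mid-parent
    cov_hs = 0.25 * v_a
    cov_fs = 0.5 * v_a + 0.25 * v_d + v_ec
    return {
        "b_offspring_one_parent": cov_op / v_p,
        "b_offspring_mid_parent": cov_op / (0.5 * v_p),
        "t_half_sibs": cov_hs / v_p,
        "t_full_sibs": cov_fs / v_p,
    }


# Invented components: heritability V_A / V_P = 0.4.
result = phenotypic_resemblance(v_a=40.0, v_d=10.0, v_ec=5.0, v_p=100.0)
for name, value in result.items():
    print(name, round(value, 3))
```

In this invented case twice the full-sib correlation is well above the heritability of 0.4, because the dominance and common-environment terms are added to the additive part.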

Problems 

9.1 What is the coefficient of relationship, r, between the children of a pair of 
MZ twins married to unrelated spouses? 


[Solution 8] 

9.2 Problems 7.4, 7.8 and 8.2(3) dealt with the effects of two genes on the pigmentation
of mouse hairs in a specified population. Suppose that the measurements of
individuals were subject to environmental variance amounting to one-third of the
total genetic variance, but that there were no environmental differences between family
means. What would then be the phenotypic resemblance between the following
relatives in this population: (1) offspring and mid-parent, (2) offspring and one parent,
(3) full sibs, (4) half sibs, (5) double first cousins?


[Solution 18] 

9.3 The following correlations of total finger-ridge counts have been reported. Are 
they consistent with each other? What assumptions have to be made in comparing 
the correlations? 

Midparent—child: r = 0.69 ± 0.03 

Father—child: r = 0.50 ± 0.04 

Mother—child: r = 0.49 ± 0.04 

Data from Holt, S. B. (1956/57) Acta genet., 6, 473—6. 


[Solution 28] 

9.4 The estimates below refer to the litter size of mice under different experimental
procedures. Litter size is the number of young born and is a character of the mother.
In one case the litters by which the litter size of the mother was measured were
all standardized at birth to 8 young by removing young in excess of 8 or fostering
young from other litters to make up to 8. In the other case the litters were not
manipulated and most of the young born were reared to weaning. Suggest reasons,
other than sampling error and strain differences, that might account for the differences
between the estimates. In both cases the full sibs were mothers born in the same litter.

                              Litters           Litters
                              standardized      not standardized

Full-sib correlation            0.055             0.107
Daughter-dam regression         0.045            -0.028

Data from Eisen, E. J. (1978) Genetics, 88, 781-811; Falconer, D. S. (1965) pp.
763-74 in Geerts, S. J. (ed.), Genetics Today. Proc. XI Internat. Congr. Genetics,
Vol. 3. Pergamon, Oxford.

[Solution 38] 



HERITABILITY


The heritability of a metric character is one of its most important properties. It 
expresses, as we have seen, the proportion of the total variance that is attributable 
to the average effects of genes, and this is what determines the degree of resemblance 
between relatives. But the most important function of the heritability in the genetic 
study of metric characters has not yet been mentioned, namely its predictive role, 
expressing the reliability of the phenotypic value as a guide to the breeding value. 
Only the phenotypic values of individuals can be directly measured, but it is the 
breeding value that determines their influence on the next generation. Therefore 
if the breeder or experimenter chooses individuals to be parents according to their 
phenotypic values, his success in changing the characteristics of the population can 
be predicted only from a knowledge of the degree of correspondence between 
phenotypic values and breeding values. This degree of correspondence is measured 
by the heritability, as the following considerations will show. 

The heritability is defined as the ratio of additive genetic variance to phenotypic
variance:

$h^2 = \frac{V_A}{V_P}$    ... [10.1]

(The customary symbol $h^2$ stands for the heritability itself and not for its square.
The symbol derives from Wright's (1921) terminology, where $h$ stands for the corresponding
ratio of standard deviations.) An equivalent meaning of the heritability
is the regression of breeding value on phenotypic value:

$h^2 = b_{AP}$    ... [10.2]

The equivalence of these meanings can be seen from reasoning similar to that by
which the genetic covariance of offspring and one parent was derived in the previous
chapter. If we split the phenotypic value into breeding value and a remainder ($R$)
consisting of the environmental, dominance, and interaction deviations, then
$P = A + R$. Since $A$ and $R$ are uncorrelated, $\mathrm{cov}_{AP} = V_A$ and so $b_{AP} = V_A/V_P = h^2$.

We may note also that the correlation between breeding values and phenotypic
values, $r_{AP}$, is equal to the square-root of the heritability. This follows from the
general relationship between correlation and regression coefficients, which gives

$r_{AP} = b_{AP}\,\frac{\sigma_P}{\sigma_A} = h^2 \cdot \frac{1}{h} = h$    ... [10.3]
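
The two meanings of the heritability can be illustrated by simulation. The sketch below draws breeding values A and remainders R as independent normal deviates, which is an assumption made here for convenience and is not required by the argument in the text. It then computes the regression of A on P and the correlation between them. The function name, sample size, seed and variances are all arbitrary choices.

```python
import random


def simulate_b_and_r(v_a, v_r, n=50_000, seed=1):
    """Simulate P = A + R and return (b_AP, r_AP) from the sample."""
    rng = random.Random(seed)
    a_vals = [rng.gauss(0.0, v_a ** 0.5) for _ in range(n)]
    p_vals = [a + rng.gauss(0.0, v_r ** 0.5) for a in a_vals]

    mean_a = sum(a_vals) / n
    mean_p = sum(p_vals) / n
    cov_ap = sum((a - mean_a) * (p - mean_p)
                 for a, p in zip(a_vals, p_vals)) / (n - 1)
    var_a = sum((a - mean_a) ** 2 for a in a_vals) / (n - 1)
    var_p = sum((p - mean_p) ** 2 for p in p_vals) / (n - 1)

    b_ap = cov_ap / var_p                      # regression of A on P
    r_ap = cov_ap / (var_a * var_p) ** 0.5     # correlation of A with P
    return b_ap, r_ap


# V_A = 40 and V_R = 60 give V_P = 100 and a heritability of 0.4.
b_ap, r_ap = simulate_b_and_r(v_a=40.0, v_r=60.0)
print(b_ap, r_ap, 0.4 ** 0.5)
```

Apart from sampling error, the regression should be close to 0.4 and the correlation close to its square root, in line with equations [10.2] and [10.3].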
By regarding the heritability as the regression of breeding value on phenotypic
value we see that an individual's estimated breeding value is the product of its
phenotypic value and the heritability

$A_{(\mathrm{expected})} = h^2 P$    ... [10.4]

breeding values and phenotypic values both being reckoned as deviations from the
population mean. In other words, the heritability expresses the reliability of the
phenotypic value as a guide to the breeding value, or the degree of correspondence
between phenotypic value and breeding value. For this reason the heritability enters
into almost every formula connected with breeding methods, and many practical
decisions about procedure depend on its magnitude. These matters, however, will
be considered in the next chapters; here we are concerned only to point out that
the determination of the heritability is one of the first objectives in the genetic study
of a metric character.
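
As a small worked case of equation [10.4], the sketch below converts a measured phenotypic value into an expected breeding value. The function name and the numbers are invented for illustration. The only step added to the equation is the subtraction of the population mean, since both values are reckoned as deviations.

```python
def expected_breeding_value(phenotype, population_mean, h2):
    """Expected breeding value, as a deviation from the population mean."""
    return h2 * (phenotype - population_mean)


# An individual 10 units above the mean, for a character with h2 = 0.4,
# is expected to have a breeding value 4 units above the mean.
deviation = expected_breeding_value(phenotype=110.0, population_mean=100.0, h2=0.4)
print(deviation)
```

The lower the heritability, the more strongly the expected breeding value is regressed towards the population mean.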

It is important to realize that the heritability is a property not only of a character 
but also of the population, of the environmental circumstance to which the individuals 
are subjected, and of the way in which the phenotype is measured. Since the value 
of the heritability depends on the magnitude of all the components of variance, a 
change in any one of these will affect it. All the genetic components are influenced 
by gene frequencies and may therefore differ from one population to another, according
to the past history of the population. In particular, small populations maintained
long enough for an appreciable amount of fixation to have taken place are expected 
to show lower heritabilities than large populations. The environmental variance is 
dependent on the conditions of culture or management: more variable conditions 
reduce the heritability; more uniform conditions increase it. And, finally, if the 
phenotype is the mean of two or more measurements the heritability will differ according
to the number of measurements and will differ from that of a single measurement,
for the reasons explained in connection with repeatability in Chapter 8. So,
whenever a value is stated for the heritability of a given character it must be 
understood to refer to a particular population under particular conditions. Values 
found in other populations under other circumstances will be more or less the same 
according to whether the structure of the population and the environmental conditions
are more or less alike.

Very many determinations of heritabilities have been made for a great variety of 
characters in animals and plants. Some examples are given in Table 10.1. 
Heritabilities cannot easily be estimated with any great precision, and most estimates 
have rather large standard errors. Different estimates for the same character in the 
same organism show a wide range of variation, some of which may reflect real differences
between populations or the conditions under which they are studied. Nevertheless,
within the range of sampling errors, estimates tend to be similar in different
populations. Because of the large sampling errors, the estimates in Table 10.1 are 
given to the nearest 5 per cent. Despite the lack of precision, it is very clear that 
heritabilities differ greatly according to the character. There is, moreover, some 
connection between the magnitude of the heritability and the nature of the character. 
This can be seen in Table 10.1. On the whole, the characters with the lowest 
heritabilities are those most closely connected with reproductive fitness, while the 
characters with the highest heritabilities are those that might be judged on biological 
Table 10.1 Approximate values of the heritability of various characters in various animal
species. The estimates are rounded to the nearest 5 per cent; their standard errors range
from about 2 per cent to about 10 per cent.

                                          h^2 (%)    Ref.

Man
  Stature                                   65       (1)
  Serum immunoglobulin (IgG) level          45       (2)
Cattle
  Body weight (adult)                       65       (3)
  Butterfat, %                              40       (4)
  Milk-yield                                35       (4)
Pigs
  Back-fat thickness                        70       (5)
  Efficiency of food conversion             50       (5)
  Weight gain per day                       40       (5)
  Litter size                                5       (6)
Poultry
  Body weight (at 32 wks)                   55       (7)
  Egg weight (at 32 wks)                    50       (7)
  Egg production (to 72 wks)                10       (7)
Mice
  Tail length (at 6 wks)                    40       (8)
  Body weight (at 6 wks)                    35       (8)
  Litter size (1st litters)                 20       (9)
Drosophila melanogaster
  Abdominal bristle number                  50       (10)
  Body size                                 40       (11)
  Ovary size                                30       (12)
  Egg production                            20       (11)

(1) West African population. Roberts, Billewicz, and McGregor (1978).
(2) US whites. Grundbacher (1974).
(3) Beef cattle; average of many estimates. Preston and Willis (1970).
(4) British Friesians, 1st lactations. Barker and Robertson (1966).
(5) British Large White. Smith, King and Gilbert (1962).
(6) British Large White. Strang and Smith (1979).
(7) White Leghorn strain-crosses. Emsley, Dickerson, and Kashyap (1977).
(8) Rutledge, Eisen, and Legates (1973).
(9) Falconer (1965b).
(10) Clayton, Morris, and Robertson (1957).
(11) Robertson (1957b).
(12) Robertson (1957a).


grounds to be the least important as determinants of natural fitness. This relationship
has been well substantiated by extensive surveys of the heritabilities of different
characters in Drosophila (Roff and Mousseau, 1987) and in wild populations of a 
great variety of species (Mousseau and Roff, 1987). The reasons why different sorts 
of character should have different heritabilities will be considered in Chapter 20. 

Some care is needed in applying the concept of heritability to plants. When defined
as $h^2 = V_A/V_P$, the variances are those of individual values. Individual values of
plants, particularly of their yields, are often not available or, if available, are rendered
largely meaningless by competition, which was mentioned as an environmental factor
in the previous chapter. Yields are usually expressed per unit area of plot in
which the plants are grown. The unit of measurement is therefore the plot yield,
not the individual yield. If the individuals in a plot are members of one family
(full or half sibs) the ‘heritability’ is the heritability of differences between families,
the meaning of which will be explained in Chapter 13. The rest of this chapter refers
only to the heritability of individual values.

Estimation of heritability 

The heritability is estimated from the degree of resemblance between relatives. Table
10.2 shows again the composition of the phenotypic covariances derived in the
previous chapter. The right-hand column gives the regression or correlation expressed
in terms of the heritability from which it can be seen that with any relationship

$h^2 = b/r$ or $t/r$    ... [10.5]

where $r$ is the coefficient of the additive variance in the covariance, or the ‘theoretical’
correlation. Thus, when expressed in terms of the correlation (or regression) between
relatives, the heritability is the observed correlation as a proportion of the
correlation that would be found if the character were completely inherited, i.e., if
all the variance were additive genetic.

The choice of what sort of relatives to use for the estimation of the heritability 
depends on the circumstances. In addition to the practical matter of which sorts of 
relatives are in fact available, there are two points to consider: precision and bias. 
In general, the closer the relationship, the more precise is the estimate. The reason 
for this is that the observed regression or correlation must be multiplied by a larger 
factor {Hr) with more distant relatives, and the standard error of the regression or 
correlation must be multiplied by the same factor to give the standard error of the 
estimated heritability. The statistical precision will be considered more fully later 
in this chapter. Bias in the estimate of the heritability is usually a more important 
consideration than precision. It is introduced by environmental sources of covariance 
and, in the case of full sibs, by dominance. From considerations of the biology of 
the character and the experimental design, we have to decide which covariance is
least likely to be augmented by an environmental component, a matter already discussed
in the last chapter. Generally speaking, the half-sib correlation and the regression
of offspring on father are the most reliable from this point of view. The regression
of offspring on mother is sometimes liable to give too high an estimate on account
of maternal effects, as it would, for example, with body size in most mammals.
(Example 10.3 illustrates the bias due to a maternal effect.) The full-sib correlation,
which is the only relationship for which an environmental component of covariance
is shown in the table, is the least reliable of all. The component due to common
environment is often present in large amount and is difficult to overcome by experimental
design; and the full-sib covariance is further augmented by the dominance
variance. The full-sib correlation can therefore seldom do more than set an upper
limit to the heritability.

Table 10.2

Relatives                   Covariance*                                        Regression (b) or correlation (t)
Offspring and one parent    $\tfrac{1}{2}V_A$                                  $b = \tfrac{1}{2}h^2$
Offspring and mid-parent    $\tfrac{1}{2}V_A$                                  $b = h^2$
Half sibs                   $\tfrac{1}{4}V_A$                                  $t = \tfrac{1}{4}h^2$
Full sibs                   $\tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec}$       $t \ge \tfrac{1}{2}h^2$

*The contributions of epistatic interactions are ignored, and so are the possible environmental
contributions to relatives other than full sibs.
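The conversion in equation [10.5] can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary keys, and the half-sib correlation used below are invented for the purpose and do not come from any of the studies cited. The divisors are the ratios of b or t to the heritability shown in the right-hand column of Table 10.2.

```python
# Ratio of the observed regression (b) or correlation (t) to the heritability,
# read from the right-hand column of Table 10.2.
RATIO_TO_H2 = {
    "offspring_one_parent": 0.5,   # b = (1/2) h^2
    "offspring_mid_parent": 1.0,   # b = h^2
    "half_sibs": 0.25,             # t = (1/4) h^2
    "full_sibs": 0.5,              # t >= (1/2) h^2, so the result is an upper limit
}


def heritability_from_resemblance(value, std_error, relationship):
    """Return (h2, standard error of h2) from an observed b or t."""
    ratio = RATIO_TO_H2[relationship]
    # The estimate and its standard error are scaled by the same factor.
    return value / ratio, std_error / ratio


# Invented half-sib correlation and standard error, for illustration only.
h2, se_h2 = heritability_from_resemblance(0.12, 0.03, "half_sibs")
print(f"h2 = {h2:.2f} +/- {se_h2:.2f}")
```

The sketch makes the point about precision visible: a half-sib correlation is multiplied by four, and so is its standard error, whereas a regression on one parent is only doubled.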

Example 10.1 The heritability of abdominal bristle number in Drosophila melanogaster
has been determined by three different methods, applied to the same population (Clayton,
Morris, and Robertson, 1957), with the following results:

Method of estimation            Heritability
Offspring-parent regression     0.51 ± 0.07
Half-sib correlation            0.48 ± 0.11
Full-sib correlation            0.53 ± 0.07
Combined estimate               0.52



The estimates obtained by the three methods are in very satisfactory agreement. In this 
case, the character — bristle number — is free of complications arising from maternal 
effects and common environment. 

Let us now consider briefly some technical matters concerning the translation of 
observational data into estimates of heritability. For the moment it will be assumed
that all observations are made on a random-mating population with no selection of 
the parents. Later, the effects of assortative mating and of selection will be described. 

Offspring—parent regression 

The estimation of the heritability from the regression of offspring on parents is comparatively
straightforward. The data are obtained in the form of measurements of
parents (one or the mean of both) and the mean of their offspring. The covariance
is then computed from the cross-products of the paired values. The following example
illustrates the regression on mid-parent values.

Example 10.2. Figure 10.1 illustrates the regression of offspring on mid-parent values 
for wing length in Drosophila melanogaster (Reeve and Robertson, 1953). There are 
37 pairs of parents and a mean of 2.73 offspring were measured from each pair of parents. 
The parents were mated assortatively, with the result that the variance of mid-parent 
values is greater than it would be if mating had been at random, as will be explained 
in a later section. Each point on the graph represents the mean value of one pair of parents 
(measured along the horizontal axis), and the mean value of their offspring (measured 
along the vertical axis). The axes are marked at intervals of 1/100 mm, and they intersect
at the mean value of all parents and all offspring. The sloping line is the linear regression
of offspring on mid-parent. The slope of this line estimates the heritability, and
has the value (± standard error): $h^2 = b_{OP} = 0.58 \pm 0.07$.

Fig. 10.1. Regression of offspring on mid-parent for wing length in Drosophila, as explained in
Example 10.2. Mid-parent values are shown along the horizontal axis, and mean value of
offspring along the vertical axis. (Drawn from data kindly supplied by Dr E. C. R. Reeve.)
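The computation of the regression from cross-products, as used in Example 10.2, might be sketched as follows. The five pairs of values are invented deviations (in units of 1/100 mm) chosen to keep the arithmetic easy to follow; they are not the data of Reeve and Robertson, and the function name is an assumption made for this illustration.

```python
def regression_slope(parent_values, offspring_means):
    """Slope of offspring on parent: sum of cross-products / sum of squares."""
    n = len(parent_values)
    mean_p = sum(parent_values) / n
    mean_o = sum(offspring_means) / n
    # Sum of cross-products of the paired deviations (n - 1 cancels in the ratio).
    sum_cross = sum((p - mean_p) * (o - mean_o)
                    for p, o in zip(parent_values, offspring_means))
    # Sum of squares of the parental deviations.
    sum_sq = sum((p - mean_p) ** 2 for p in parent_values)
    return sum_cross / sum_sq


# Invented mid-parent values and offspring means, as deviations from the mean.
mid_parent = [-3.0, -1.0, 0.0, 1.0, 3.0]
offspring = [-2.0, 0.0, 0.0, 1.0, 1.0]

b_op = regression_slope(mid_parent, offspring)
print(f"regression on mid-parent, an estimate of h2: {b_op:.2f}")
```

With mid-parent values on the horizontal axis the slope itself estimates the heritability; had the values of one parent been used instead, the slope would have to be doubled.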


A complication in the use of the regression of offspring on parents arises if the 
variance is not equal in the two sexes. It was noted in the previous chapter that the 
covariance of offspring and mid-parent values is equal to the additive genetic variance 
on condition that the sexes are equal in phenotypic variance. If the variances are 
not equal, the regression on mid-parent cannot, strictly speaking, be used, and the 
heritability must be calculated separately for each sex. The heritability in males, 
for example, is estimated from the regression of sons on fathers, and of daughters 
on fathers. The regression of daughters on fathers, however, must be adjusted for 
the difference in variation, multiplying it by the ratio of phenotypic standard deviations
of males to females. Thus if $b$ is the regression of daughters on fathers, the
adjusted regression is $b' = b\,\sigma_{♂}/\sigma_{♀}$. Similarly, the heritability in females is
estimated from the regression of daughters on mothers, and of sons on mothers
adjusted by $\sigma_{♀}/\sigma_{♂}$. Estimations from the four regressions, and the adjustments for
unequal variances, are illustrated in the following example. 

Example 10.3 The heritability of the body weight at 6 weeks of age was estimated 
in a random-bred strain of mice by offspring-parent regression (Falconer, 1973). The 
variances of males and females were not equal, and so the regressions were calculated 
separately for each sex of offspring and of parent. The phenotypic standard deviations 
and their ratios were as follows: 

$\sigma_{♂} = 3.786$, $\sigma_{♀} = 2.675$, $\sigma_{♂}/\sigma_{♀} = 1.415$, $\sigma_{♀}/\sigma_{♂} = 0.707$

Table (i) gives the regression coefficients and their standard errors, with the factors by 
which both must be multiplied to adjust for the difference in variance. The regressions 
are all of offspring on one parent, so the regressions and their standard errors must be 
multiplied by two to obtain the heritabilities given in table (ii). The estimates do not
differ significantly according to the sex of offspring, so male and female offspring are
averaged in the third line of table (ii). The estimates do, however, differ significantly
between male and female parents. The much higher estimate from females is attributable
to bias from a maternal effect.

Table (i) Regression coefficients with standard errors and adjustment factors

             Parents
Offspring    Male                                         Female
Male         0.110 ± 0.040                                (0.324 ± 0.064) × 0.707 = 0.229 ± 0.045
Female       (0.111 ± 0.029) × 1.415 = 0.157 ± 0.041      0.237 ± 0.043

Table (ii) Heritabilities, per cent, with standard errors.

             Parents
Offspring    Male       Female
Male         22 ± 8     46 ± 9
Female       31 ± 8     47 ± 9
Both         27 ± 6     47 ± 6
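The two adjusted entries of Example 10.3 can be retraced with a short Python sketch. The standard deviations and regression coefficients are those quoted in the example; the function name and its arguments are assumptions made for this illustration.

```python
SD_MALE = 3.786     # phenotypic standard deviation of males
SD_FEMALE = 2.675   # phenotypic standard deviation of females


def heritability_one_parent(b, se, sd_parent, sd_offspring):
    """Adjust a regression on one parent for unequal variance, then double it.

    When parent and offspring are of different sexes the regression is
    multiplied by sd_parent / sd_offspring; when they are of the same sex
    the factor is 1. Returns (h2, standard error), both as percentages.
    """
    factor = sd_parent / sd_offspring
    return 200.0 * b * factor, 200.0 * se * factor


# Daughters on fathers: adjusted by sd(male) / sd(female).
h2_daughter_father = heritability_one_parent(0.111, 0.029, SD_MALE, SD_FEMALE)
# Sons on mothers: adjusted by sd(female) / sd(male).
h2_son_mother = heritability_one_parent(0.324, 0.064, SD_FEMALE, SD_MALE)
# Sons on fathers: same sex, so no adjustment is needed.
h2_son_father = heritability_one_parent(0.110, 0.040, SD_MALE, SD_MALE)

for label, (h2, se) in [("daughters on fathers", h2_daughter_father),
                        ("sons on mothers", h2_son_mother),
                        ("sons on fathers", h2_son_father)]:
    print(f"{label}: {h2:.0f} +/- {se:.0f} per cent")
```

The rounded results should agree with the corresponding entries of table (ii).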

Sib analysis 

The estimation of heritability from half sibs is more complicated than appears at
first sight and needs more detailed comment. A common form in which data are
obtained with animals is the following. A number of males (sires) are each mated
to several females (dams), the males and females being randomly chosen and randomly
mated. A number of offspring from each female are measured to provide
the data. The individuals measured thus form a population of half-sib and full-sib
families. An analysis of variance is then made by which the phenotypic variance is
divided into observational components attributable to differences between the progeny
of different males (the between-sire component, $\sigma^2_S$); to differences between
the progeny of females mated to the same male (between-dam, within-sires, component,
$\sigma^2_D$); and to differences between individual offspring of the same female
(within-progenies component, $\sigma^2_W$). The form of the analysis is shown in Table
10.3. There are supposed to be $s$ sires, each mated to $d$ dams, which produce $k$ offspring
each. The values of the mean squares are denoted by $MS_S$, $MS_D$, and $MS_W$.
The mean square within progenies is itself the estimate of the within-progeny variance
component, $\sigma^2_W$; but the other mean squares are not the variance components. The
compositions of the mean squares in terms of the observational components of
variance are shown in the right-hand column of the table, consideration of which
will show how the variance components are to be estimated. The between-dam mean
square, for example, is made up of the within-progeny component together with
$k$ times the between-dam component; so the between-dam component is estimated
as $\sigma^2_D = (1/k)(MS_D - MS_W)$. Similarly, the between-sire component is estimated
as $\sigma^2_S = (1/dk)(MS_S - MS_D)$, where $dk$ is the number of offspring per sire. If there
are unequal numbers of offspring from the dams, or of dams in the sire-groups,
the mean values of $k$ and $d$ can be used with little error, provided the inequality
of numbers is not very great. The exact solution, which is too complicated for description
here, can be found in Snedecor and Cochran (1967, Section 10.19), Turner
and Young (1969), or Searle (1971).

Table 10.3 Form of analysis of half-sib and full-sib families

Source                          d.f.          Mean square    Composition of mean square
Between sires                   $s - 1$       $MS_S$         $= \sigma^2_W + k\sigma^2_D + dk\sigma^2_S$
Between dams (within sires)     $s(d - 1)$    $MS_D$         $= \sigma^2_W + k\sigma^2_D$
Within progenies                $sd(k - 1)$   $MS_W$         $= \sigma^2_W$

s = number of sires
d = number of dams per sire
k = number of offspring per dam
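For a balanced design the step from mean squares to observational components is simple enough to sketch in Python. The function name is an assumption, and the mean squares used below are invented round numbers rather than data from any real analysis.

```python
def sib_variance_components(ms_sires, ms_dams, ms_within, d, k):
    """Observational components from the mean squares of Table 10.3.

    d is the number of dams per sire and k the number of offspring per dam;
    equal numbers are assumed throughout.
    """
    var_w = ms_within                        # MS_W estimates sigma^2_W directly
    var_d = (ms_dams - ms_within) / k        # (1/k)(MS_D - MS_W)
    var_s = (ms_sires - ms_dams) / (d * k)   # (1/dk)(MS_S - MS_D)
    return var_s, var_d, var_w


# Invented mean squares for a design with 3 dams per sire and 4 offspring per dam.
var_s, var_d, var_w = sib_variance_components(12.0, 6.0, 3.0, d=3, k=4)
print(f"sires {var_s:.2f}, dams {var_d:.2f}, within progenies {var_w:.2f}")
```

With unequal numbers the same function could be called with the mean values of d and k, which is the approximation mentioned in the text, not the exact solution.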

The next step is to deduce the connections between the observational components
that have been estimated from the data and the causal components, in particular the
additive genetic variance, the estimation of which is the main purpose of the analysis.
Though all the information needed has already been given, the interpretation of the
observational components, which is given in Table 10.4, is not immediately apparent
without explanation. The first point to note is that the estimate of the phenotypic
variance is given by the sum ($\sigma^2_T$) of the three observational components: $V_P = \sigma^2_T
= \sigma^2_S + \sigma^2_D + \sigma^2_W$. This is not necessarily equal to the observed variance as
estimated from the total sum of squares, though the two seldom differ by much.
Now consider the interpretation of the between-sire component, $\sigma^2_S$. This is the
variance between the means of half-sib families and it therefore estimates the
phenotypic covariance of half sibs, $\mathrm{cov}_{(HS)}$, which is $\tfrac{1}{4}V_A$. Thus $\sigma^2_S = \tfrac{1}{4}V_A$. Next
consider the within-progeny component, $\sigma^2_W$. Since any between-group variance
component is equal to the covariance of the members of the groups, it follows that
a within-group component is equal to the total variance minus the covariance of
members of the groups. The progenies of the dams are full-sib families and so the
within-progeny variance estimates $V_P - \mathrm{cov}_{(FS)}$. This leads to the interpretation
$\sigma^2_W = \tfrac{1}{2}V_A + \tfrac{3}{4}V_D + V_{Ew}$. Finally, there remains the between-dam component,
and what it estimates can be found by subtraction as follows:

$\sigma^2_D = \sigma^2_T - \sigma^2_S - \sigma^2_W = \mathrm{cov}_{(FS)} - \mathrm{cov}_{(HS)} = \tfrac{1}{4}V_A + \tfrac{1}{4}V_D + V_{Ec}$

Consideration of the between-sire and between-dam components will show that their
sum gives an estimate of the full-sib covariance, $\mathrm{cov}_{(FS)}$, but this provides no new
information for estimating the causal components. These conclusions about the connection
between observational and causal components of variance are summarized
in Table 10.4. The contributions of the interaction variance to the observational components
can be deduced from the contributions to the covariances given in Table 9.4.

Table 10.4 Interpretation of the observational components of variance in a sib analysis.

Observational component                                     Covariance and causal components estimated
Sires:         $\sigma^2_S$                                 $= \mathrm{cov}_{(HS)}$                         $= \tfrac{1}{4}V_A$
Dams:          $\sigma^2_D$                                 $= \mathrm{cov}_{(FS)} - \mathrm{cov}_{(HS)}$   $= \tfrac{1}{4}V_A + \tfrac{1}{4}V_D + V_{Ec}$
Progenies:     $\sigma^2_W$                                 $= V_P - \mathrm{cov}_{(FS)}$                   $= \tfrac{1}{2}V_A + \tfrac{3}{4}V_D + V_{Ew}$
Total:         $\sigma^2_T = \sigma^2_S + \sigma^2_D + \sigma^2_W$   $= V_P$                                $= V_A + V_D + V_{Ec} + V_{Ew}$
Sires + Dams:  $\sigma^2_S + \sigma^2_D$                    $= \mathrm{cov}_{(FS)}$                         $= \tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec}$

The estimation of the heritability from sib analyses is illustrated in the two following 
examples. The calculation of the standard error of the estimate, which is complicated, 
is described by Turner and Young (1969). 

Example 10.4 As an illustration of the estimation of heritability from a sib analysis, 
we refer to the study of Danish Landrace pigs based on the records of the Danish Pig 
Progeny Testing Stations (Fredeen and Jonsson, 1957). The data came from 468 sires 
each mated to 2 dams, the analysis being made on the records of 2 male and 2 female 
offspring from each dam. Only one such analysis is given here: that of body length in 
the male offspring. The analysis, shown in the table, was made within stations and within 
years, and this accounts for the degrees of freedom being fewer than would appear appropriate
from the numbers stated above. The interpretation of the analysis, shown at the
foot of the table, has been slightly simplified by the omission of some minor adjustments
not relevant for us at this stage. The between-dam component is not greater than the
between-sire component, so there cannot be much non-additive genetic variance or
variance due to common environment. The two estimates of the heritability, from the
sire and dam components respectively, can therefore be regarded as equally reliable,
and their combination based on the resemblance between full sibs may be taken as the
best estimate.

Sib analysis of body length in Danish Landrace pigs; data for male offspring only.

Source                        d.f.    Mean square    Component of variance
Between sires                 432     6.03           $\sigma^2_S = \tfrac{1}{4}(6.03 - 3.81) = 0.555$
Between dams, within sires    468     3.81           $\sigma^2_D = \tfrac{1}{2}(3.81 - 2.87) = 0.47$
Within progenies              936     2.87           $\sigma^2_W = 2.87$
                                                     $\sigma^2_T = 3.895$

Interpretation of analysis

Sib correlations
Half sibs:  $t_{(HS)} = \sigma^2_S / \sigma^2_T = 0.142$
Full sibs:  $t_{(FS)} = (\sigma^2_S + \sigma^2_D) / \sigma^2_T = 0.263$

Estimates of heritability
Sire-component:  $h^2 = 4\sigma^2_S / \sigma^2_T = 0.57$
Dam-component:   $h^2 = 4\sigma^2_D / \sigma^2_T = 0.48$
Sire + Dam:      $h^2 = 2(\sigma^2_S + \sigma^2_D) / \sigma^2_T = 0.53$
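The interpretation at the foot of the table in Example 10.4 can be checked with a brief Python sketch. The three components are those given in the example; the function and key names are assumptions made for this illustration.

```python
def sib_analysis_estimates(var_s, var_d, var_w):
    """Sib correlations and heritability estimates from observational components."""
    var_t = var_s + var_d + var_w                 # estimate of V_P
    return {
        "t_half_sibs": var_s / var_t,
        "t_full_sibs": (var_s + var_d) / var_t,
        "h2_sire": 4.0 * var_s / var_t,           # free of V_Ec and V_D
        "h2_dam": 4.0 * var_d / var_t,            # inflated by V_Ec and V_D if present
        "h2_sire_plus_dam": 2.0 * (var_s + var_d) / var_t,
    }


# Components of variance for body length from Example 10.4.
estimates = sib_analysis_estimates(var_s=0.555, var_d=0.47, var_w=2.87)
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

The sire and dam estimates are kept separate because only their comparison shows whether common environment or dominance is inflating the dam component.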

Example 10.5 We have not yet had an example to illustrate the effect of common environment
in augmenting the full-sib correlation. This is provided by body size in mice.
The analysis given in table (i) refers to the weight of female mice at 6 weeks of age
(J. C. Bowman, unpublished). There were 719 offspring from 74 sires and 192 dams,
each with one litter. These were spread over 4 generations and the analysis was made
within generations. The analysis is complicated by the inequality of the number of offspring
per dam and of dams per sire. The adjustments made for these inequalities are
shown, without explanation, in the compositions of the mean squares from which the
components are estimated. The dam component is much larger than the sire component,
indicating a substantial bias due to common environment or dominance. Therefore only
the sire component can be used to estimate the heritability. The estimate obtained is
$h^2 = 4 \times (0.48/5.14) = 0.37$. (This estimate has a standard error of 0.26, so that it is
not significantly different from zero. The experiment was on too small a scale to be of
much practical use, though it serves to illustrate the method.) The causal components
can now be estimated from the analysis according to the interpretation given in Table
10.4. It is not possible to discriminate between common environment and dominance
as the cause of the difference between the dam and sire components. The estimates in
table (ii) are based on the assumption that the difference is all due to common environment,
and that $V_D = 0$. We can go a little further than this in the interpretation of the
analysis and put an upper limit on the dominance variance as follows. The maximum
possible value for $V_D$ is set by the within-progenies component $\sigma^2_W$: it is possible, though
very unlikely, that $V_{Ew} = 0$, which would make $\sigma^2_W - 2\sigma^2_S = \tfrac{3}{4}V_D$, from which $V_D = 1.64$
= 32 per cent as an upper limit. $V_{Ec}$ would then be $\sigma^2_D - \sigma^2_S - \tfrac{1}{4}V_D = 1.58$ = 31
per cent as a lower limit. The true values of the causal components are, however, likely
to be much nearer those in table (ii).

Table (i)

Source       d.f.    Mean square    Composition of M.S.                                Components
Sires        70      17.10          $\sigma^2_W + k'\sigma^2_D + dk'\sigma^2_S$        $\sigma^2_S = 0.48$
Dams         118     10.79          $\sigma^2_W + k\sigma^2_D$                         $\sigma^2_D = 2.47$
Progenies    527     2.19           $\sigma^2_W$                                       $\sigma^2_W = 2.19$

$k = 3.48$; $k' = 4.16$; $d = 2.33$                                                    $\sigma^2_T = 5.14$

Table (ii)

$V_P = \sigma^2_T = 5.14$ = 100%
$V_A = 4\sigma^2_S = 1.92$ = 37%
$V_{Ec} = \sigma^2_D - \sigma^2_S = 1.99$ = 39%
$V_{Ew} = \sigma^2_W - 2\sigma^2_S = 1.23$ = 24%
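The arithmetic of Example 10.5 follows directly from Table 10.4 and might be retraced as in the Python sketch below. The three observational components are those of table (i); the function name and the layout of its result are assumptions made for this illustration.

```python
def causal_components(var_s, var_d, var_w):
    """Causal components implied by Table 10.4 under two extreme assumptions."""
    v_p = var_s + var_d + var_w
    v_a = 4.0 * var_s
    # Assumption of table (ii): the excess of the dam component is all V_Ec (V_D = 0).
    no_dominance = {"V_A": v_a, "V_Ec": var_d - var_s, "V_Ew": var_w - 2.0 * var_s}
    # Opposite extreme: V_Ew = 0, so sigma^2_W - 2 sigma^2_S = (3/4) V_D.
    v_d_max = (var_w - 2.0 * var_s) / 0.75
    limits = {"V_D_upper": v_d_max, "V_Ec_lower": var_d - var_s - 0.25 * v_d_max}
    return v_p, no_dominance, limits


# Observational components for 6-week weight of female mice, table (i).
v_p, no_dominance, limits = causal_components(var_s=0.48, var_d=2.47, var_w=2.19)
for name, value in {**no_dominance, **limits}.items():
    print(f"{name} = {value:.2f} = {100.0 * value / v_p:.0f} per cent")
```

The two dictionaries bracket the possibilities: the first reproduces table (ii), and the second gives the upper limit for the dominance variance with the corresponding lower limit for the common-environment variance.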


Intra-sire regression of offspring on dam 

The heritability can be estimated from the offspring-parent relationship in a population
with the structure described in the foregoing section by the regression of offspring
on dams calculated within sire-groups. That is to say, the regression of
offspring on dams is calculated separately for each set of dams mated to one sire,
and the regression from each set pooled in a weighted average. This regression
estimates half the heritability, just as would the overall regression with sires ignored.
The within-sire regression is preferable to the overall regression because it eliminates
unwanted variation about the regression line due to sires, or to environmental differences
between sire-groups. It has other advantages, explained by Lush (1940),
the chief of which is the elimination of environmental covariance that arises if sire-groups
are in different herds but the dams and progeny in each sire-group are reared
in the same herd.
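One way of pooling the within-sire regressions is sketched below in Python. It assumes, as an illustration and not as the procedure of any particular study, that each sire-group's regression is weighted by its sum of squares of dam values, which amounts to dividing the pooled within-group sum of cross-products by the pooled within-group sum of squares. The dam and offspring values are invented.

```python
def intra_sire_regression(sire_groups):
    """Pooled within-sire regression of offspring on dam.

    sire_groups is a list of (dam_values, offspring_means) pairs, one per sire.
    Deviations are taken from the means of each sire-group, so differences
    between sire-groups do not contribute to the regression.
    """
    pooled_cross = 0.0
    pooled_sq = 0.0
    for dams, offspring in sire_groups:
        n = len(dams)
        mean_dam = sum(dams) / n
        mean_off = sum(offspring) / n
        pooled_cross += sum((x - mean_dam) * (y - mean_off)
                            for x, y in zip(dams, offspring))
        pooled_sq += sum((x - mean_dam) ** 2 for x in dams)
    return pooled_cross / pooled_sq


# Two invented sire-groups, each of three dams with the means of their offspring.
groups = [
    ([10.0, 12.0, 14.0], [11.0, 11.5, 12.0]),   # within-group slope 0.25
    ([20.0, 22.0, 24.0], [15.0, 16.0, 17.0]),   # within-group slope 0.50
]
b_within = intra_sire_regression(groups)
print(f"intra-sire regression = {b_within:.3f}, h2 = {2.0 * b_within:.2f}")
```

Because the regression is on one parent, it is doubled to give the estimate of the heritability.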

Combined estimates 

In any population with pedigree records it will often be possible to estimate the 
heritability from several different sorts of relatives. It is then obviously desirable 
to make use of all the data by combining the estimates from different relationships, 
suitably weighted. The preferred procedure for doing this is known as REML 
(Restricted Maximum Likelihood). It calculates the heritability that would give the 
greatest likelihood of getting the observed phenotypic values of all the individuals 
in the data. For details see Kennedy (1981), Thompson (1982), Shaw (1987). 

Twins and human data 

Identical twins seem at first sight to provide, for man and cattle, a means of estimating 
the genotypic variance. They provide individuals of identical genotype, just as inbred
lines, or crosses between lines, do for laboratory animals or for plants. Many
studies of human twins have been made, and have shown the members of the pairs 
to be extremely alike in most characters, even when reared apart from childhood. 
Studies of cattle twins, though on a much smaller scale, show the same thing (see 
Hancock, 1954; Brumby, 1958). Taken at their face value, these studies seem to 
indicate a very high degree of genetic determination — up to 90 per cent or even 
more — for many characters. The use of identical twins in this way is, however, 
vitiated by the additional similarity due to common environment. Twins share a common
environment from conception to birth and over the period during which they
are reared together, so that the between-pair variance contains the variance due to
common environment, $V_{Ec}$, confounded with the genetic variance, $V_G$. This difficulty
may be partly overcome by comparison of the two sorts of twins, identical
or monozygotic (MZ) and fraternal or dizygotic (DZ). Dizygotic twins are full sibs 
that share a common environment to approximately the same extent as monozygotic 
twins. To estimate the amount of genetic variance, we ask how much less alike are 
DZ than MZ twins. Table 10.5 shows the composition of the components of variance
between and within pairs, on the assumption that both components of the environmental
variance, $V_{Ec}$ and $V_{Ew}$, are the same in MZ as in DZ twins. The contributions
from the interaction variance, which are omitted for simplicity, can be added from
Table 9.4. The difference between MZ and DZ twins in both components estimates
half of the additive variance together with three-quarters of the dominance variance.
It may be noted that the difference between the between-pairs mean squares has the
same expectation as the difference between the components, i.e., $(MS_{MZ} - MS_{DZ})
= (\sigma^2_{b(MZ)} - \sigma^2_{b(DZ)}) = \tfrac{1}{2}V_A + \tfrac{3}{4}V_D$. The correlation between co-twins is the
between-pair component divided by the phenotypic variance, so twice the difference
between the MZ correlation and the DZ correlation estimates $(V_A + \tfrac{3}{2}V_D)/V_P$.
This is nearer to the degree of genetic determination (broad-sense heritability) than
it is to the heritability (narrow-sense), which is perhaps what is wanted from human
data. It should be noted, however, that the twin analysis does not provide a strictly
valid estimate of either $V_A/V_P$ or of $V_G/V_P$, even with the assumption of equality
of the environmental components. If the epistatic components are added to the covariances
it will be seen that the bias is increased. Example 10.6 illustrates the twin
analysis applied to four human characters.

Table 10.5 Composition of the components of variance between and within pairs of
twins, omitting interaction components.

                   Between pairs, $\sigma^2_b$                         Within pairs, $\sigma^2_w$
Identical (MZ)     $V_A + V_D + V_{Ec}$                                $V_{Ew}$
Fraternal (DZ)     $\tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec}$        $\tfrac{1}{2}V_A + \tfrac{3}{4}V_D + V_{Ew}$
Difference         $\tfrac{1}{2}V_A + \tfrac{3}{4}V_D$                 $\tfrac{1}{2}V_A + \tfrac{3}{4}V_D$

The analysis of twin data outlined above rests critically on two assumptions. The 
first, already mentioned, is that the environmental components of variance are the 
same in the two types of twins. The second, not yet mentioned, is that the total genetic 
variance is the same in the two types. Furthermore, the object of the analysis is 
to estimate parameters of the population, most of whom are not twins, so for these 
estimates to be valid the environmental components of variance of twins must be 
the same as those of single-born individuals. There are many possible causes of differences
in the environmental components, of which the following are some (Stern,
1973, explains and discusses these more fully). 

(1) Genotype-environment interaction: as explained in Chapter 8, this is formally
included with the environmental variance. It will contribute different amounts to 
the MZ and DZ environmental components. 

(2) MZ twins are of three types according to the arrangement of the foetal membranes:
a single amnion and single chorion, a single chorion, or separate amnions
and chorions; all DZ twins are of the last type.

(3) Competition between co-twins in utero, which is probably more severe in MZ 
than in DZ pairs. 

(4) Exact contemporaneity of twins as opposed to singletons. 

(5) Parental treatment of twins, which may either enhance or diminish the similarity, 
and may affect MZ and DZ twins differently. 

(6) Errors in the diagnosis of zygosity.

(7) The inclusion of unlike-sexed pairs among the DZ twins. 

Differences between the total genetic variance of MZ and DZ twins can arise in 
the following way (Nance, 1976). The frequency of DZ twinning is influenced by 
genetic factors including racial differences, whereas the frequency of MZ twinning 
is little, if at all, influenced by genetic or racial factors. Therefore the different sections
or strata of the population may be differently represented among samples of
MZ and DZ twins, and the genetic variances may differ in consequence. The requirement
of equality of variances may be tested by comparison of the total variances
estimated as $\sigma^2_b + \sigma^2_w$, though counterbalancing differences of genetic and environmental
components cannot be ruled out. If the total variances prove to be equal
and there is no obvious preference for the between-pair or within-pair comparison,
the information from the two can be combined by averaging them (Christian et al.,
1974), i.e., $(\sigma^2_{b(MZ)} - \sigma^2_{b(DZ)}) + (\sigma^2_{w(DZ)} - \sigma^2_{w(MZ)}) = V_A + \tfrac{3}{2}V_D$. To estimate
the ‘heritability’, the value obtained for $V_A + \tfrac{3}{2}V_D$ is divided by the total variance
$V_P$. If the total variance of twins is not the same as that of singletons, the ‘heritability’
applicable to the population would be obtained by taking $V_P$ from singletons.

Despite all the difficulties in twin analyses, there is probably less bias from 
inequality of environmental variances than there is from the common-environment 
component $V_{Ec}$ in full-sib correlations. The following example illustrates the point.

Example 10.6 The table gives the correlations of MZ twins, like-sexed DZ twins, and 
full sibs for four characters, from Huntley (1966). The characters, all measured on 
children, are the total ridge count on ten fingers, height adjusted for age, a verbal IQ 
test, and a social-maturity score which ‘assesses the individual’s ability to look after 
his practical needs and to take responsibility in relation to his age’. These were chosen 
to represent characters that would be expected to be, in the order stated, increasingly 
subject to environmental influences. The ‘heritabilities’ estimated from the twin-differences,
shown at the foot of the table, are consistent with this expectation. The
estimates from doubling the full-sib correlation are obviously too high, except for the
ridge count, being biased upwards by common environment, $V_{Ec}$. The twin analyses
have, at least partially, removed this bias. The heritability of the finger-ridge count has
been estimated from offspring-parent regressions as about 0.8 (Mi and Rashad, 1975).
The heritabilities of the counts of single fingers are lower, ranging from 0.58 to 0.68
for different fingers. The high value for the total count results from the multiple measurement
which eliminates all but one-tenth of the environmental component $V_{Es}$ affecting
each finger separately, as explained in Chapter 8.

                          Finger-ridge count    Height    IQ score    Social-maturity score
MZ twins                  0.96                  0.90      0.83        0.97
DZ twins                  0.47                  0.57      0.66        0.89
Full sibs                 0.51                  0.50      0.58        0.32
$2(t_{MZ} - t_{DZ})$      0.98                  0.66      0.34        0.16
$2t_{FS}$                 1.02                  1.00      1.16        0.64
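The last two lines of the table in Example 10.6 follow from the three correlations by simple arithmetic, which the following Python sketch retraces. The correlations are those quoted from Huntley (1966); the variable and function names are assumptions made for this illustration.

```python
# (MZ twins, DZ twins, full sibs) correlations from the table of Example 10.6.
CORRELATIONS = {
    "finger-ridge count": (0.96, 0.47, 0.51),
    "height": (0.90, 0.57, 0.50),
    "IQ score": (0.83, 0.66, 0.58),
    "social-maturity score": (0.97, 0.89, 0.32),
}


def twin_estimate(t_mz, t_dz):
    """Twice the MZ-DZ difference, which estimates (V_A + 1.5 V_D) / V_P."""
    return 2.0 * (t_mz - t_dz)


def full_sib_estimate(t_fs):
    """Twice the full-sib correlation, inflated by V_Ec and by dominance."""
    return 2.0 * t_fs


results = {
    character: (twin_estimate(t_mz, t_dz), full_sib_estimate(t_fs))
    for character, (t_mz, t_dz, t_fs) in CORRELATIONS.items()
}
for character, (from_twins, from_sibs) in results.items():
    print(f"{character}: twins {from_twins:.2f}, full sibs {from_sibs:.2f}")
```

Estimates above 1, as obtained from the full sibs for IQ score, are impossible as proportions of variance and so point to the bias from common environment.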


The effects of common environment present serious difficulties in the interpretation
of the correlations between relatives in man, especially for characters influenced
by cultural transmission. These difficulties in arriving at a meaningful estimate of 
the heritability cannot be discussed here. It must suffice to say that they may be 
at least partly overcome by utilizing correlations of several different sorts of relatives 
and having an index that quantifies the environment to which each family is subject 
(see Morton, 1974; Rao, Morton, and Yee, 1974, 1976; Elston, 1988 and references 
therein). Methods of estimating the effects of cultural transmission are described 
by Eaves (1976) and by Cloninger, Rice, and Reich (1979). These last authors conclude,
for example, that the heritability of IQ scores in their data was 33 per cent
but the ‘total transmissible variance’ was 69 per cent. Rather than trying to estimate 
genetic parameters such as the heritability, it is perhaps more important to test whether 
the parameters are non-zero. This is done by ‘model-fitting’. A ‘model’ is simply
a series of expectations for correlations between relatives based on the hypothesis
to be tested. The hypothesis might be, for example, that there is variation due to 
common environment but no genetic variance. If the data give a significantly bad 
fit to the expectations of the model, the hypothesis is disproved. The application 
of these methods to psychological characters in man is reviewed by Eaves et al. 
(1978). 

Assortative mating 

Assortative mating means mating ‘like with like’ and is seen in a correlation between 
the phenotypic values of mated individuals. Mating in human populations is assortative
with respect to many metric characters, such as stature and IQ scores, though
not necessarily by deliberate choice of mates. The questions to be considered in this 
section are how assortative mating affects the estimation of heritability, and whether 
the use of assortative mating as a deliberate breeding policy has any advantages in 
this respect. The genetic consequences of assortative mating are rather complex and 
only the conclusions can be given here, with no more than brief indications of how 
they are arrived at. Full explanations are given by Crow and Kimura (1970). 

The degree of assortative mating is expressed as the correlation r between the 
phenotypic values of the mated individuals, and this is what can be observed. The 
genetic consequences, however, depend on the correlation m between the breeding 
values of the mates. To deduce the connection between m and r it is necessary to 
know what governs the choice of mates — whether the primary cause of the 
resemblance is phenotypic, genetic, or environmental. Primary phenotypic 
resemblance means that the mates are chosen on the basis of their phenotypic values 
of the character under consideration. This is how assortative mating would be applied 
in a breeding programme. The relationship between the two correlations can then 
be shown to be $m = rh^2$, where $h^2$ is the heritability of the character by which
the mates are chosen. (The derivation of this relationship will be explained later.) 
The consequences to be described are restricted to primary phenotypic resemblance 
as a cause of assortative mating. Assortative mating in man, however, probably 
seldom arises purely in this way and caution is needed in applying the results to 
human data, particularly in assuming the relationship $m = rh^2$ to be applicable.

Primary genetic or primary environmental resemblance occurs if matings take place 
within groups that are differentiated from each other genetically or environmentally. 
This is probably how much of the assortative mating in man arises. The observed 
phenotypic correlation r is then a ‘secondary’ correlation resulting from the ‘primary’ 
correlation of breeding values or of environmental deviations. The primary correlations
cannot be deduced from r unless one of them can be estimated by other means,
and the genetic consequences of the assortative mating cannot be deduced without 
a knowledge of m. If the primary correlation is wholly environmental (m = 0), there 
will be no genetic consequences of the assortative mating (except that the increased 
variance of mid-parent phenotypic values will reduce the regression of offspring 
on mid-parent). Environmental correlation may be the basis of the assortative mating 
for IQ in man. An analysis of family data on IQ scores (Rao, Morton, and Yee, 
1976) showed that the phenotypic correlation between husband and wife of r = 0.5 
could be largely, perhaps wholly, attributed to people choosing a spouse from those 
with a family background similar to their own. 





Returning to assortative mating by phenotypic value, the genetic consequences 
are, in summary, as follows. The resulting correlation m between breeding values 
causes an increase of the additive variance, and consequently of the heritability. The 
correlations between relatives, however, are increased by more than would result 
from the increased heritability alone. There is therefore a possible ambiguity in the
meaning of heritability under assortative mating. It may be thought of as the determination
of the resemblance between relatives, as expressed in equation [10.5], or
as the ratio of variance components, $V_A/V_P$, and the two are not the same under
assortative mating. The definition as $V_A/V_P$ will be retained here. The questions with
which we shall be mainly concerned are: by how much is the heritability increased,
and how is the heritability (defined as $V_A/V_P$) to be estimated from the resemblance
between relatives? 

Other aspects of assortative mating that must be noted are the following. (1) The 
full effects are not immediate; it takes some generations following random mating 
to reach an equilibrium state. (2) The effects are dependent on the number of loci 
influencing the character: it will be assumed that the number is large. (3) The effect 
on the dominance variance is small and may be neglected. Attention will be restricted 
to pair-matings producing full-sib families in the progeny. Linkage will be 
disregarded. 

The consequences of assortative mating can be worked out by consideration of 
the covariances of mated pairs, of which three are needed. These are given in Table 
10.6 in terms of the two correlations, r (between phenotypic values) and m (between
breeding values), and the variance components in the generation to which
the mated pairs belong. The relationship $m = rh^2$, stated earlier, can now be derived
as follows: $\mathrm{cov}(A_1A_2) = \mathrm{cov}(h^2P_1, h^2P_2) = h^4\,\mathrm{cov}(P_1P_2) = h^4rV_P = rh^2V_A$;
by (2) of Table 10.6, $\mathrm{cov}(A_1A_2) = mV_A$, so $m = rh^2$. It is important to note
that $h^2$ here is the heritability of the character measured at the age at which the
choice of mates takes place. The variance in the progeny is obtained as follows.
The covariance of breeding values of the parents increases the variance of mid-parent 
breeding values and so increases the variance between family-means. The variance 
within families is due to segregation and is not affected provided the number of 
segregating loci is not small. Adding together the between-family and the within- 
family components gives the total variance in the offspring generation. Equations 
relating the additive and phenotypic variances to those in the random breeding base 
population are given in Table 10.6. In each case the first equation, (4) and (6), refers 
to the offspring of one generation of assortative mating, and the second, (5) and 
(7), to a population that has reached equilibrium, when the variances remain con¬ 
stant. Dividing the additive by the phenotypic variance gives the heritability in equa¬ 
tions (8) and (9). 
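
As a numerical illustration of equations (4), (6) and (8) of Table 10.6, the following Python sketch follows the variances through one generation of assortative mating, starting from a random-breeding base. The function name and the example values are hypothetical, and it is assumed that mates are chosen purely by phenotypic value, so that $m = rh^2$.

```python
def one_generation_assortative(v_a0, v_p0, r):
    """Variances after one generation of assortative mating (Table 10.6, eqs 4, 6, 8).

    Assumes choice of mates purely by phenotypic value, so that m = r * h2.
    """
    h2_0 = v_a0 / v_p0                     # heritability in the random-breeding base
    m = r * h2_0                           # correlation between breeding values of mates
    v_a1 = v_a0 * (1 + 0.5 * m)            # equation (4)
    v_p1 = v_p0 * (1 + 0.5 * m * h2_0)     # equation (6)
    return v_a1, v_p1, v_a1 / v_p1         # the ratio is the heritability of equation (8)


# Illustrative values only: V_A0 = 50, V_P0 = 100, r = 0.5
v_a1, v_p1, h2_1 = one_generation_assortative(50.0, 100.0, 0.5)
print(v_a1, v_p1, round(h2_1, 3))
```

In this sketch the phenotypic variance rises by the same amount as the additive variance, since only the between-family additive component is changed.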

With these equations we can answer two questions about the heritability. First, 
by how much does one generation of assortative mating increase the heritability? 
Equation (8) with substitution of $m = rh^2$ shows that it increases it by a factor of
$(1 + \tfrac{1}{2}h_0^2r)/(1 + \tfrac{1}{2}h_0^4r)$, where $h_0^2$ is the original heritability. The increase may
be useful in improving the accuracy with which individuals’ breeding values can 
be predicted from their phenotypic values. The increase of the heritability, however, 
is never very great — at most about 10 per cent. For an experiment with Drosophila 
on the use of assortative mating in this way, see McBride and Robertson (1963). 
The second question is this: if we have a population in equilibrium and estimate
the heritability in it, what would the heritability be if the population were mating
at random? Equation (9) with substitution of $m = rh^2$ shows that if, for example,
the heritability were 0.50 under assortative mating of r = 0.5, the random-mating
heritability would be 0.43. This question is relevant to human populations if comparisons
are to be made with other species, but equation (9) can be applied to human
data only if m is known or guessed, because $m = rh^2$ is probably seldom true for
the reasons already given.

Table 10.6 Assortative mating. Approximate expressions for variances and covariances.
(For meanings of symbols, see notes below.)

Covariances of mates:
  phenotypic values                   $\mathrm{cov}(P_1P_2) = rV_P$                         (1)
  breeding values                     $\mathrm{cov}(A_1A_2) = mV_A = mh^2V_P$               (2)
  breeding value of one with
  environmental deviation of other    $\mathrm{cov}(A_1E_2) = (r - m)V_A = (r - m)h^2V_P$   (3)

Variances:
                 Additive                                    Phenotypic                                     Heritability
  1 generation   $V_{A1} = V_{A0}(1 + \tfrac{1}{2}m)$  (4)   $V_{P1} = V_{P0}(1 + \tfrac{1}{2}mh^2)$  (6)   $h_1^2 = h_0^2\,\dfrac{1 + \tfrac{1}{2}m}{1 + \tfrac{1}{2}mh^2}$  (8)
  equilibrium    $V_{A0} = V_A(1 - m)$  (5)                  $V_{P0} = V_P(1 - mh^2)$  (7)                  $h_0^2 = h^2\,\dfrac{1 - m}{1 - mh^2}$  (9)

Relatives:
                          Covariance                       Regression (b) or correlation (t)
  Offspring, mid-parent   $\tfrac{1}{2}V_A(1 + r)$  (10)   $b = h^2$  (13)
  Offspring, one parent   $\tfrac{1}{2}V_A(1 + r)$  (11)   $b = \tfrac{1}{2}h^2(1 + r)$  (14)
  Full sibs               $\tfrac{1}{2}V_A(1 + m)$  (12)   $t = \tfrac{1}{2}h^2(1 + m)$  (15)

Notes:

r = correlation between phenotypic values of mates.

m = correlation between breeding values of mates. When choice of mates is purely by phenotypic
values, $m = rh^2$.

$h^2$ = heritability, defined as $V_A/V_P$, at the age of mating.

Covariances of mates: subscripts 1 and 2 refer to the two mated individuals; E includes
non-additive genetic deviations; $V_A$ and $V_P$ refer to the generation of the mated pairs.

Variances: subscript 0 refers to the random breeding base, 1 refers to the offspring of 1
generation of assortative mating.

Relatives: dominance, $V_D$, and common environment, $V_{Ec}$, are omitted; $V_A$ and $h^2$ refer to the
parental generation with correlations r and m.
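
The two questions about the heritability can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and $m = rh^2$ is assumed whenever m is not supplied, which the text warns is probably seldom true in human populations.

```python
def heritability_gain_factor(h2_0, r):
    """Factor by which one generation of assortative mating raises the heritability.

    Equation (8) of Table 10.6 with m = r * h2_0 substituted.
    """
    return (1 + 0.5 * h2_0 * r) / (1 + 0.5 * h2_0 ** 2 * r)


def random_mating_heritability(h2, r, m=None):
    """Equation (9): heritability expected under random mating, from the equilibrium h2.

    If m is not supplied, m = r * h2 is assumed (mate choice purely by phenotype).
    """
    if m is None:
        m = r * h2
    return h2 * (1 - m) / (1 - m * h2)


print(round(heritability_gain_factor(0.5, 1.0), 3))     # complete assortative mating
print(round(random_mating_heritability(0.5, 0.5), 2))   # the example given in the text
```

Passing m explicitly allows equation (9) to be applied when the correlation between breeding values is known or guessed by other means.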

A final question to consider is the estimation of the heritability, defined as V A /V P , 
in a population with assortative mating. The covariances of relatives can be worked 
out in the manner described in Chapter 9, taking account of the parental covariances 
given in (1), (2), and (3) of Table 10.6. The covariance and regression or correlation
are given for three relationships in Table 10.6. (For the correlations of other
relatives, see Nagylaki, 1978.) These expressions apply to any generation, $V_A$ and
$h^2$ being the values in the parental generation. $V_A$ is increased over its random
breeding value, as shown by equation (5), and the covariances are increased by a
further factor of $(1 + r)$ for offspring and parents and by $(1 + m)$ for full sibs.
As before, the offspring-parent covariance is the same for mid-parent values as
for single parents. The variance of mid-parent values, however, is increased by the
same amount as the covariance; so the regression of offspring on mid-parent values
is equal to the heritability, as in a random-breeding population. This conclusion has
important practical consequences for the estimation of the heritability. Assortative
mating among the parents does not affect the regression of offspring on mid-parent
values and so the regression provides a valid estimate of the heritability in the population
from which the parents came. Assortative mating, however, has the advantage
of increasing the precision of the estimate, the standard error being reduced because
the variance of mid-parent values is increased, as will be explained in the next section
(for details see Reeve, 1961).
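
The cancellation of the factor $(1 + r)$ in the mid-parent regression can be seen in a short Python sketch of equations (10), (11), (13) and (14) of Table 10.6. The function name and the variances used are hypothetical, and dominance and common environment are ignored as in the table.

```python
def parent_offspring_regressions(v_a, v_p, r):
    """Regressions of offspring on parents under assortative mating (Table 10.6).

    v_a, v_p: additive and phenotypic variances in the parental generation;
    r: phenotypic correlation between mates.
    """
    cov_offspring_parent = 0.5 * v_a * (1 + r)        # equations (10) and (11)
    var_mid_parent = 0.5 * v_p * (1 + r)              # variance of mid-parent values
    b_mid = cov_offspring_parent / var_mid_parent     # equation (13): equals h2
    b_single = cov_offspring_parent / v_p             # equation (14)
    return b_mid, b_single


# Illustrative values only: V_A = 40, V_P = 100, so h2 = 0.4
for r in (0.0, 0.5, 1.0):
    print(r, parent_offspring_regressions(40.0, 100.0, r))
```

Whatever value of r is used, the first regression stays at the heritability, while the second rises from half the heritability towards the whole of it.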

The regression of offspring on single parents and the full-sib correlation are both 
affected by assortative mating. The variance of single parents is simply the phenotypic 
variance, but because of the correlation with the unmeasured parent the regression 
of offspring on single parents is increased by the factor (1 + r), and with perfect 
assortative mating (r = 1) it would be the same as the regression on mid-parent 
values. The correlation of full sibs in equation (15) of Table 10.6 omits dominance
and common environment. If these were assumed to be negligible, the heritability
could be estimated by substituting $m = rh^2$. The equation is then quadratic with
the solution $h^2 = [-1 + \sqrt{1 + 8rt}]/2r$.
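
The quadratic solution can be checked by a round trip through equation (15). The following is a minimal sketch with hypothetical function names; it assumes $m = rh^2$ and negligible dominance and common environment, and it handles r = 0 separately because the formula divides by r.

```python
import math


def full_sib_correlation(h2, r):
    """Equation (15) of Table 10.6 with m = r * h2 substituted."""
    return 0.5 * h2 * (1 + r * h2)


def h2_from_full_sibs(t, r):
    """Positive root of r*h^4 + h^2 - 2t = 0; reduces to h2 = 2t when r = 0."""
    if r == 0:
        return 2 * t
    return (-1 + math.sqrt(1 + 8 * r * t)) / (2 * r)


# Illustrative values only: full-sib correlation 0.3, correlation between mates 0.5
h2 = h2_from_full_sibs(0.3, 0.5)
print(h2, full_sib_correlation(h2, 0.5))
```

With any positive r the estimate is lower than 2t, the value that would be obtained if the assortative mating were overlooked.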

Precision of estimates and design of experiments 

The precision of an estimate of heritability, indicated by its standard error, is easily 
obtained from the standard error of the regression or correlation from which the 
heritability is estimated. Standard errors of heritability estimates are uncomfortably 
large unless the regression or correlation is based on very large numbers, so it is 
important to do everything possible to minimize the standard error. In planning an 
experiment to estimate a heritability, one wants to know how many observations 
are needed to achieve a given degree of precision; and to achieve the greatest pos¬ 
sible precision, within the limitations imposed by the available facilities, one needs 
to know what is the best method and the best design of the experiment. These are 
the problems to be considered now. The choice of method is between regression 
of offspring on one or on both parents, and sib-correlations. The choice of method, 
however, is usually determined more by practical considerations and by freedom 
from bias, than by precision. We shall therefore not give much attention to the com¬ 
parison of methods. The question of design concerns the number of individuals per 
family. The total number of individuals that can be measured is limited by space, 
labour, or cost. Increasing the number of individuals per family therefore reduces 
the number of families. The problem is to find the best compromise between large 
families and many families that will minimize the sampling variance of the regres¬ 
sion or correlation. 

In assessing the relative efficiencies of different methods and designs, we have 
to compare experiments made on the same scale; that is to say, with the same total 
expenditure in labour or cost. We must therefore decide first what are the circumstances
that limit the scale of the experiment. If the labour of measurement is
the limiting factor, as for example in experiments with Drosophila, then the limitation
is in the total number of individuals measured, including the parents if they
are measured. If, on the other hand, breeding and rearing space is the limiting factor,
as it generally is with larger animals, the limitation may be either in the number
of families or in the total number of offspring that can be produced for measurement,
and measurements of the parents may be included without additional cost.

We cannot take account here of all the ways in which the scale of an experiment 
may be limited. For the sake of illustration, the limitation will be taken to be the 
number of individuals that can be measured in one generation, implying that equal 
numbers of parents and offspring can be measured. After finding the optimal design 
for offspring-parent regressions, we shall deal with assortative mating and selec¬ 
tion of parents and with the weighting of families when the number of offspring 
per family is not limited. Then we shall consider the optimal design for estimation 
from sib-correlations. The principles of finding the optimal design are described 
by Latter and Robertson (1960) for offspring—parent regressions and by A. Robertson 
(1959a) for sib-correlations. 


Offspring—parent regression 

Consider first estimates based on the regression of offspring on parents. Let X be
the independent variate, which may be either the value of a single parent or the mid-parent
value. Let Y be the dependent variate, which may be either a single offspring
of each parent or the mean of n offspring. Let $\sigma_X^2$ and $\sigma_Y^2$ be the variances of X and
Y respectively; let b be the regression of Y on X, and N the number of paired observations
of X and Y, which is equivalent to the number of families in the experiment.
By rearrangement of the standard formula, the sampling variance of a regression
coefficient can be expressed as

    $\sigma_b^2 = \dfrac{1}{N - 2}\left[\dfrac{\sigma_Y^2}{\sigma_X^2} - b^2\right]$

For use as a guide to design, this formula is more convenient if put in a simplified
and approximate form. The regression coefficient is usually small enough that $b^2$
can be ignored; and we may suppose that N is fairly large, so that the variance of
the estimate may be put, approximately, as

    $\sigma_b^2 = \dfrac{1}{N}\,\dfrac{\sigma_Y^2}{\sigma_X^2}$ (approx.)    ... [10.6]


Equation [10.6] can be expressed in terms of numbers in the following way. Let
n be the number of offspring measured per family and k the number of parents,
i.e., 1 or 2. Then, provided the parents are not mated assortatively, the variance
of single parents, or of mid-parent values, is

    $\sigma_X^2 = \dfrac{1}{k}V_P$

The variance of the offspring values, $\sigma_Y^2$, is the variance of the observed means of
families of n individuals. This depends on the phenotypic correlation t between
members of families, in a manner that will be explained in Chapter 13 (see Table
13.3), where it will be shown that

    $\sigma_Y^2 = \dfrac{1 + (n - 1)t}{n}V_P$    ... [10.7]

By substitution for $\sigma_X^2$ and $\sigma_Y^2$ in equation [10.6], the sampling variance of the
regression becomes

    $\sigma_b^2 = \dfrac{k[1 + (n - 1)t]}{nN}$ (approx.)    ... [10.8]

This approximate expression for the sampling variance allows one to compare the 
methods — one or both parents measured — and to decide how many offspring should 
be measured per family. 
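
Equation [10.8] is easily tabulated. The Python sketch below uses a hypothetical function name and illustrative numbers; it shows how the approximate sampling variance of the regression changes with family size when the total number of offspring, nN, is held fixed.

```python
def regression_sampling_variance(k, n, n_families, t):
    """Approximate sampling variance of the regression, equation [10.8].

    k: parents measured per family (1 or 2); n: offspring measured per family;
    n_families: the number of families N; t: phenotypic correlation between sibs.
    """
    return k * (1 + (n - 1) * t) / (n * n_families)


# Fixed total of 400 offspring (n * N = 400), one parent measured, t = 0.25
for n in (1, 2, 4, 8):
    print(n, regression_sampling_variance(k=1, n=n, n_families=400 // n, t=0.25))
```

With nN fixed, each increase of n adds to the numerator only, so the smallest variance is found at n = 1.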


One parent. Consider first the measurement of only one parent. The denominator,
nN, of equation [10.8] is the total number of offspring measured. If this is what
limits the scale of the experiment, then nN is fixed and the sampling variance is
minimal when n = 1, i.e., $(n - 1)t = 0$. Thus the most efficient design under these
circumstances is to have as many families as possible and to measure only one offspring
per family. The standard error of the estimate of the heritability will then
be as follows:

    s.e.$(h^2) = 2\sigma_b = 2/\sqrt{N}$ (approx.)    ... [10.9]

To achieve a standard error of 0.1 it is necessary to measure 400 parents and 400 
offspring. This illustrates the fact that large numbers are needed to give estimates 
of even very modest precision. If only 100 families could be measured, the standard 
error would be about 0.2 and no estimates under about 0.4 would be significantly 
different from zero. If the number of offspring per family can be increased without 
reducing the number of families, this will increase the precision. The increase of 
precision, however, depends on the sib-correlation t, because additional offspring 
give more information about the family-mean when the correlation is low. The advant¬ 
age gained can be worked out from equation [10.8] if the value of t is known. 
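
The numbers quoted for the one-parent design follow directly from equations [10.8] and [10.9]. The sketch below, with hypothetical function names, gives the number of families needed for a target standard error and the standard error obtained when extra offspring are measured without reducing the number of families; the value t = 0.25 is purely illustrative.

```python
import math


def se_h2_one_parent(n_families, n=1, t=0.0):
    """Approximate s.e. of h2 = 2b from regression on one parent ([10.8] with k = 1)."""
    return 2 * math.sqrt((1 + (n - 1) * t) / (n * n_families))


def families_for_target_se(target_se):
    """Families needed with one offspring each, from [10.9]: N = (2 / s.e.)^2."""
    # round() guards against floating-point noise before taking the ceiling
    return math.ceil(round((2 / target_se) ** 2, 9))


print(families_for_target_se(0.1))           # the example in the text
print(se_h2_one_parent(100))                 # 100 families, one offspring each
print(se_h2_one_parent(100, n=4, t=0.25))    # 100 families, four offspring each
```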


Both parents. Now consider the measurement of both parents for the regression on
mid-parent values. If only one offspring was measured per family, substitution of
k = 2, n = 1 in equation [10.8] gives

    s.e.$(h^2) = \sigma_b = \sqrt{2/N}$ (approx.)    ... [10.10]

However, if both parents are measured, the same facilities will allow two offspring
per family to be measured. Substituting k = 2, n = 2 into equation [10.8] gives
the standard error of the estimate as $\sqrt{(1 + t)/N}$, where t is the full-sib correlation.
Comparison will show that, under most circumstances, regression on mid-parent
gives better precision than regression on one parent.

Assortative mating. Mating the parents assortatively increases the precision. Both
parents must, of course, be measured, so only the regression on mid-parent values
need be considered. The effect comes from the increase of the variance of mid-parent
values. The variance of offspring is also increased but not by much, and this will
be neglected for the sake of simplicity. The variance of mid-parent values under
assortative mating is $\tfrac{1}{2}V_P(1 + r)$, where r is the correlation between mates.
Substituting this for $\sigma_X^2$ in equation [10.6], and taking $\sigma_Y^2$ from equation [10.7] as
before, shows that the sampling variance of the regression with assortative mating
is approximately 1/(1 + r) times the sampling variance with random mating. Thus
the precision, in terms of standard errors, is increased by a factor of $\sqrt{1 + r}$,
or by $\sqrt{2}$ if assortative mating is complete, i.e., if r = 1.
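
The comparison between the two regression methods, and the gain from assortative mating, can be sketched as follows. The functions are hypothetical, and the scale of the experiment is taken, as an assumption for illustration, to be M individuals measured in each generation: M families with one parent and one offspring, or M/2 families with both parents and two offspring.

```python
import math


def se_one_parent(m_per_generation):
    """[10.9] with N = M families, one parent and one offspring each (h2 = 2b)."""
    return 2 / math.sqrt(m_per_generation)


def se_mid_parent(m_per_generation, t, r=0.0):
    """Mid-parent regression with N = M/2 families of two offspring each.

    r is the correlation between mates; assortative mating divides the
    sampling variance by (1 + r), the small change in offspring variance
    being neglected as in the text.
    """
    n_families = m_per_generation / 2
    return math.sqrt((1 + t) / (n_families * (1 + r)))


# Illustrative values only: M = 400, full-sib correlation t = 0.25
print(se_one_parent(400))
print(se_mid_parent(400, 0.25))
print(se_mid_parent(400, 0.25, r=1.0))
```

On this basis the mid-parent design has the smaller standard error whenever t is less than 1.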

Weighting families of unequal size. It is often possible to measure as many off¬ 
spring as there are in each family without reducing the number of families. The 
number of offspring per family then varies among families and this introduces the 
problem of how to weight the families according to the number of offspring. The 
appropriate weighting depends on the phenotypic correlation t between the offspring 
in the families. The principle of the weighting is that families of size n are weighted 
in proportion to the reciprocal of the variance of the regression that would be ob¬ 
tained if all families were of size n. The weighting is described by Kempthorne and 
Tandon (1953). The following procedure (Falconer, 1963) is a modification which 
adjusts the weights so that families of size n = 1 always have a weight of 1. First, 
the intraclass correlation t must be calculated from an analysis of variance between 
and within families of offspring. Second, the regression coefficient to be estimated 
must be guessed at, or estimated approximately from unweighted means of families. 
The weight $w_n$ to be given to the mean of n offspring is then

    $w_n = (n + nT)/(1 + nT)$    ... [10.11]

where $T = (t - b^2)/(1 - t)$ for regression on single parents, and
$T = (t - \tfrac{1}{2}b^2)/(1 - t)$ for regression on mid-parent values. The weighting does not have
much effect on the precision unless n varies substantially. Bohren, McKean, and
Yamada (1961) examine its merits.
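
A minimal sketch of the weighting in equation [10.11] is given below. The function name and the values of t and b are hypothetical. For mid-parent regression it assumes $T = (t - \tfrac{1}{2}b^2)/(1 - t)$, which is what follows if the variance of mid-parent values is half the phenotypic variance.

```python
def family_weight(n, t, b, mid_parent=False):
    """Weight for the mean of n offspring, equation [10.11].

    t: intraclass correlation of offspring within families;
    b: provisional value of the regression coefficient;
    mid_parent: assumed to change the b-term from b^2 to b^2 / 2.
    """
    b_term = 0.5 * b ** 2 if mid_parent else b ** 2
    big_t = (t - b_term) / (1 - t)
    return (n + n * big_t) / (1 + n * big_t)


# Illustrative values only: t = 0.3, b = 0.2, regression on single parents
for n in (1, 2, 5, 10):
    print(n, round(family_weight(n, 0.3, 0.2), 3))
```

A family of one offspring always has a weight of 1, and the weights rise with n but level off at (1 + T)/T, so very large families gain little further weight.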

Sib analyses 

Now let us consider estimates obtained from the intraclass correlation of full-sib 
or half-sib families. We shall at first suppose for simplicity that half-sib families 
are not subdivided into full-sib families, i.e., that only one offspring from each dam 
is measured in paternal half-sib families. Let N be the number of families, and n 
the number of individuals per family, so that the total number of individuals measured 
is T = nN. Let the intraclass correlation be t. The sampling variance of the intraclass 
correlation is then 


    $\sigma_t^2 = \dfrac{2[1 + (n - 1)t]^2(1 - t)^2}{n(n - 1)(N - 1)}$    ... [10.12]

When the value of T = nN is limited by the size of the experiment, it can be shown
that the sampling variance of the intraclass correlation is minimal when n = 1/t,
approximately. Thus the optimal family size depends on the correlation and therefore
on the heritability. Assuming that variance due to dominance and common environment
are negligible, with full sibs $t = h^2/2$, and with half sibs $t = h^2/4$. So the
family sizes giving the most efficient design are $n = 2/h^2$ for full-sib families, and
$n = 4/h^2$ for half-sib families. In the case of half sibs we are assuming that only
one offspring per dam is measured, so n is the optimal number of dams per sire.
Since prior knowledge of the heritability will be at the best only approximate, the
optimal family size cannot be exactly determined beforehand. The loss of efficiency,
however, is much greater if the family size is below the optimum than if it
is above. It is therefore better to err on the side of having too large families. A.
Robertson (1959a) shows that, in the absence of prior knowledge of the heritability,
half-sib analyses should generally be designed with families of between 20 and 30.
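
The optimum family size can also be found by direct search over equation [10.12]. The sketch below uses hypothetical function names and illustrative numbers, and for simplicity it allows the number of families, N = T/n, to be fractional.

```python
def intraclass_sampling_variance(t, n, total):
    """Equation [10.12] with N = total / n families of n individuals each."""
    n_families = total / n
    numerator = 2 * (1 + (n - 1) * t) ** 2 * (1 - t) ** 2
    return numerator / (n * (n - 1) * (n_families - 1))


def best_family_size(t, total):
    """Integer family size (at least 2) giving the smallest sampling variance."""
    candidates = range(2, total // 2 + 1)   # keeps N at 2 or more
    return min(candidates, key=lambda n: intraclass_sampling_variance(t, n, total))


# Illustrative values only: t = 0.1, 1000 individuals in all
print(best_family_size(0.1, 1000))
print(intraclass_sampling_variance(0.1, 5, 1000))    # half the approximate optimum
print(intraclass_sampling_variance(0.1, 20, 1000))   # twice the approximate optimum
```

Comparing the last two lines shows the asymmetry mentioned in the text: halving the family size costs more precision than doubling it.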

The sampling variance of the correlation when the experiment has the optimal
design is obtained by substituting n = 1/t in equation [10.12]. Making some
approximations, this leads to

    $\sigma_t^2 = 8t/T$ (approx.)    ... [10.13]

To get the sampling variance of the heritability, the variance of the full-sib correlation
must be multiplied by 4, and the variance of the half-sib correlation by 16. Then,
by substituting $t = h^2/2$ in equation [10.13], the sampling variance of the heritability
estimated from full sibs becomes

    $\sigma_{h^2}^2 = 4\sigma_t^2 = 16h^2/T$ (approx.)    ... [10.14]

And, by substituting $t = h^2/4$ in equation [10.13], the sampling variance of the
estimate from half sibs becomes

    $\sigma_{h^2}^2 = 16\sigma_t^2 = 32h^2/T$ (approx.)    ... [10.15]

Thus, other things being equal, an estimate from full-sib families is twice as precise,
in terms of their variances, as one from half-sib families.
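
Equations [10.14] and [10.15] give a quick guide to the precision to be expected from a sib analysis of optimal design. The following sketch, with hypothetical function names and illustrative numbers, returns the approximate standard errors.

```python
import math


def se_h2_full_sibs(h2, total):
    """Approximate s.e. of h2 from full sibs at the optimal design, from [10.14]."""
    return math.sqrt(16 * h2 / total)


def se_h2_half_sibs(h2, total):
    """Approximate s.e. of h2 from half sibs at the optimal design, from [10.15]."""
    return math.sqrt(32 * h2 / total)


# Illustrative values only: h2 = 0.25, T = 1600 individuals measured
print(se_h2_full_sibs(0.25, 1600))
print(se_h2_half_sibs(0.25, 1600))
```

Both formulae rest on the approximations made in deriving [10.13], so the values are a planning guide rather than exact standard errors.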

It is sometimes desirable to design an experiment for estimating the heritability 
both from offspring—parent regression and from sib-correlation. Hill and Nicholas 
(1974) show that the optimal design does not differ much from what would be the 
best for either method alone. 

Selection of parents 

In experimental populations and in farm animals the parents used are often a selected 
group. They may be selected on the basis of the character whose heritability is being 
estimated, or on the basis of some other character correlated with it. The selection 
causes the variance between parents to be reduced and consequently the covariance 
of sibs to be reduced. As a result, the heritability estimated from intraclass correla¬ 
tions is biased downwards, and can be as much as 50 per cent below its true value 
(see Ponzoni and James, 1978). If the selection of parents is based on the character 
whose heritability is being estimated, it does not affect the regression of offspring 
on parents, either single parents or mid-parent values, but it reduces the precision 
because it reduces the variance of the parents, $\sigma_X^2$, in equation [10.6] (see A.
Robertson, 1977a). Selection can, however, improve the precision if two groups
of parents are selected, one with high values and one with low values, and offspring
are reared only from these selected groups. The gain in precision comes from devoting
all the available facilities to the more extreme families, which give the most information
about the regression. When equal numbers of offspring and selected parents
are measured, the optimal proportion of parents to select in each group is about 5
per cent. For details see Hill and Thompson (1977).


Problems 

10.1 What would be the heritability estimated from each of the following correlations
or regressions, assuming that resemblance due to environment or dominance
was negligible?

(1) Regression of offspring on father = 0.21
(2) Regression of offspring on mother = 0.27
(3) Correlation of full sibs = 0.34
(4) Regression of offspring on mean of parents = 0.32
(5) Correlation of half sibs = 0.02
(6) Regression of female offspring on mother's sister = 0.03
(7) Regression of daughters on dams, within sires = 0.09

[Solution 48] 


10.2 The following data were obtained in a study of the adult height of people in 
two West African villages. Female heights were adjusted to male equivalents so that 
the means were the same in males and females. Derive what you regard as the most 
reliable estimate of the heritability, and its standard error. What other component 
of covariance, in addition to $V_A$, can be derived from the data?

Offspring-parent regressions ± standard errors

             Father           Mother           Mid-parent
Sons         0.323 ± 0.058    0.454 ± 0.057    0.705 ± 0.085
Daughters    0.291 ± 0.044    0.420 ± 0.048    0.683 ± 0.063
Both         0.303 ± 0.036    0.424 ± 0.038    0.654 ± 0.052

Standard deviations (inches): Males 2.5; Females 2.3



Data from Roberts, D.F. et al. (1978) Ann. Hum. Genet., 42, 15-24. 

[Solution 58] 

10.3 In the study used for Problem 10.2 the sib correlations given below were also 
obtained, the marriage customs of the people providing both paternal and maternal 
half sibs. The sexes did not differ in their correlations and are combined. Use these 
data to partition the variance of height in this population. (Standard errors of the 
components are not given in the solution.) How do the components estimated here 
compare with those estimated in Problem 10.2?

Full sibs             0.406 ± 0.035
Paternal half sibs    0.140 ± 0.056
Maternal half sibs    0.257 ± 0.101

[Solution 68] 

10.4 Show that the intraclass correlation of twins can be estimated from an analysis
of variance by $t = (B - W)/(B + W)$, where B and W are the mean squares
between pairs and within pairs respectively.

[Solution 78] 

10.5 Skin-fold thickness provides a useful measure of (human) fatness. The table 
gives the correlation of skin-fold thickness in twins aged under 10 and between 10 
and 15. Estimate the heritabilities in the two groups. What can be deduced about 
resemblance due to common environment? 



        Under 10    10-15
MZ      0.64        0.91
DZ      0.38        0.42

Data from Brook, C.G.C. et al. (1975) Brit. Med. J., 1975, 2, 719-21.

[Solution 88] 

10.6 The table gives the mean squares from an analysis of variance of a random¬ 
breeding population of Drosophila melanogaster. The character was the number of 
sternopleural bristles. These are bristles on the sides of the thorax, the numbers on 
one side being counted. There were 62 males (sires) each mated to 3 females (dams). 
Each female laid eggs in one vial. The bristles of 10 male and 10 female offspring 
of each dam were counted. Calculate the correlation of half sibs and of full sibs, 
and estimate the components of variance that can be separated by these data. 



                             Males    Females
Between sires                3.894    4.461
Between dams within sires    2.198    2.061
Within dams                  1.125    0.893

Data from Sheridan, A. K. et al. (1968) Theor. Appl. Genet., 38, 179-87.

[Solution 98] 

10.7 Show that if the correlation of full sibs is 0.75, the heritability of the character 
cannot be greater than 0.5. The full-sib correlation of weaning weight in mice is
0.8; what is the maximum heritability compatible with this?


[Solution 108] 





10.8 A study of morphological variation in a population of Geospiza fortis , one 
of Darwin’s Finches in the Galapagos, provides the following data on the depth of 
the bill. How would you interpret these data? 


Regressions, ± s.e.
  Offspring-midparent    0.82 ± 0.15
  Offspring-father       0.47 ± 0.17
  Offspring-mother       0.48 ± 0.13

Correlations, ± s.e.
  Full sibs              0.71 ± 0.12
  Father-mother          0.33

Data from Boag, P. T. & Grant, P. R. (1978) Nature, 274, 793-4. 


[Solution 118] 


10.9 Suppose that a population has bred with assortative mating for long enough 
to reach equilibrium. There is a correlation of r = 0.4 between mates, the choice 
of mates being based purely on phenotypic values of a particular character. The correlation
of full sibs with respect to this character is t = 0.3. The character is known
to have negligible dominance variance and to be negligibly affected by environment 
common to sibs. What is the estimate of the heritability in the population, and what 
would the heritability be if there were no assortative mating? 


[Solution 128] 

10.10 If the heritabilities given below were estimated by the methods indicated, 
and in every case the total number of individuals measured was 400, what would 
be approximately the standard errors of the estimates, assuming that there was no 
environmental resemblance between sibs, and that the data came from unselected 
individuals of a random breeding population? 

(1) $h^2$ = 0.5, from regression of sons on fathers; 200 fathers.

(2) $h^2$ = 0.6, from regression of the mean of 3 offspring (full sibs) on the mean
of their parents.

(3) $h^2$ = 0.4, from correlation of full sibs in families of 5.

(4) $h^2$ = 0.2, from correlation of half sibs in 20 families of half sibs with no full
sibs among them.


[Solution 138] 




SELECTION: 

I. The response and its prediction 


Up to this point the treatment of metric characters has been mainly concerned with 
the description of the genetic properties of a population as it exists under random 
mating, with no influences tending to change its properties; now we have to con¬ 
sider the changes brought about by the action of a breeder or experimenter. There 
are two ways in which the action of the breeder can change the genetic properties 
of the population; the first by the choice of individuals to be used as parents, which 
constitutes selection, and the second by control of the way in which the parents are 
mated, which embraces inbreeding and cross breeding. Selection in one form or 
another is the means whereby all improvement of domesticated animals and plants 
has been made. In this chapter, therefore, we start consideration of the most impor¬ 
tant application of quantitative genetics. Selection means breeding from the ‘best’ 
individuals, whatever ‘best’ may be. The ways in which the theory of quantitative 
genetics can help in this are, first, by showing how to choose individuals with the 
best breeding values and, second, by predicting the outcome so that different breeding 
schemes can be compared. The simplest form of selection is to choose individuals 
on the basis of their own phenotypic values. This is the form of selection to be con¬ 
sidered in this chapter. Chapter 13 will show how information from relatives can 
help to identify individuals with the best breeding values. Experimental selection 
in laboratory animals provides a means of studying the genetics of metric characters. 
This aspect of selection will be dealt with in the next chapter. 

In any practical selection programme the number of parents used is more or less 
restricted, with the result that some inbreeding inevitably takes place, and its effects 
are superimposed on those of the selection. Any inbreeding effects that there may 
be will at first be ignored, but they will have to be taken into consideration later. 

The basic effect of selection is to change the array of gene frequencies in the man¬ 
ner described in Chapter 2. The changes of gene frequency themselves, however, 
are now almost completely hidden from us because we cannot deal with the individual 
loci concerned with a metric character. The effects of selection that can be observed 
are therefore restricted mainly to changes of the population mean. Let us, however, 
consider the underlying changes of gene frequencies a little further in general terms. 

To describe the change of the genetic properties from one generation to the next 
we have to compare successive generations at the same point in the life-cycle of 
the individuals, and this point is fixed by the age at which the character under study 
is measured. Most often the character is measured at about the age of sexual maturity
or on the young adult individuals. The selection of parents is made after the
measurements, and the gene frequencies among these selected individuals are dif¬ 
ferent from what they were in the whole population before selection. If there are 
no differences of fertility among the selected individuals or of viability among their 
progeny, then the gene frequencies are the same in the offspring generation as in 
the selected parents. Thus artificial selection — that is, selection resulting from the 
action of the breeder in the choice of parents — produces its change of gene fre¬ 
quency by separating the adult individuals of the parent generation into two groups, 
the selected and the discarded, that differ in gene frequencies. Natural selection, 
operating through differences of fertility among the parent individuals, or of viability 
among their progeny, may cause further changes of gene frequency between the 
parent individuals and the individuals on which measurements are made in the off¬ 
spring generation. Thus there are three stages at which a change of gene frequency 
may result from selection: the first through artificial selection among the adults of 
the parent generation; the second through natural differences of fertility, also among 
the adults of the parent generation; and the third through natural differences of viability 
among the individuals of the offspring generation. Though natural differences of 
fertility and viability are always present, they are not necessarily always relevant, 
because they are not necessarily connected with the genes concerned with the metric 
character. 

Response to selection 

The change produced by selection that chiefly interests us is the change of the popula¬ 
tion mean. This is the response to selection, which will be symbolized by R; it is 
the difference of mean phenotypic value between the offspring of the selected parents 
and the whole of the parental generation before selection. The measure of the selec¬ 
tion applied is the average superiority of the selected parents, which is called the 
selection differential, and will be symbolized by S. It is the mean phenotypic value 
of the individuals selected as parents expressed as a deviation from the population 
mean, that is from the mean phenotypic value of all the individuals in the parental 
generation before selection was made. To deduce the connection between response 
and selection differential, let us imagine two successive generations of a population 
mating at random, as represented diagrammatically in Fig. 11.1. Each point represents 
a pair of parents and their progeny, and is positioned according to the mid-parent 
value measured along the horizontal axis and the mean value of the progeny measured 
along the vertical axis. The origin represents the population mean, which is assumed 
to be the same in both generations. The sloping line is the regression line of off¬ 
spring on mid-parent. (A diagram of this sort, plotted from actual data, was given 
in Fig. 10.1.) Now let us regard a group of individuals in the parental generation 
as having been selected — say those with the highest values. These pairs of parents 
and their offspring are indicated by solid dots in the figure. The parents have been 
selected on the basis of their own phenotypic values, without regard to the values 
of the progeny or of any other relatives. Let S be the mean phenotypic value of these 
selected parents, expressed as a deviation from the population mean. And similarly 
let R be the mean deviation of their offspring from the population mean. Then S 
is the selection differential and R is the response. The point marked by the cross 
represents the mean value of the selected parents and of their progeny, and its expected 
position is on the regression line as shown. Thus the ratio R/S is equal to the slope 
of the regression line. The connection between the response and selection differential 
is therefore given by 

$R = b_{O\bar{P}} S$ ...[11.1] 

Fig. 11.1. Diagrammatic representation of the mean values of progeny plotted against the 
mid-parent values, to illustrate the response to selection, as explained in the text. 

We saw in the last chapter that the regression of offspring on mid-parent is equal 
to the heritability, provided there is no non-genetic cause of resemblance between 
offspring and parents. To this we must add the further condition that there should 
be no natural selection: that is to say, that fertility and viability are not correlated 
with the phenotypic value of the character under study. Provided these conditions 
hold, therefore, the ratio of response to selection differential is equal to the heritability, 
and the response is given by 

$R = h^2 S$ ...[11.2] 

The connection between the response and the selection differential, expressed in 
equation [11.2], follows directly from the meaning of the heritability. We noted in 
the last chapter (equation [10.2]) that the heritability is equivalent to the regression 
of an individual’s breeding value on its phenotypic value. The deviation of the pro¬ 
geny from the population mean is, by definition, the breeding value of the parents, 
and so the response is equivalent to the breeding value of the parents. Thus it follows 
that the expected value of the progeny is given by $R = h^2 S$. 

There is one point at which the situation envisaged in deducing the equations of 
response does not coincide with what is actually done in selection. We supposed 
the individuals of the parent generation to have mated at random and the selection 
to have been applied subsequently. In practice, however, the selection is usually 
made before mating, on the basis of the individuals’ values and not the mid-parent 
values. The effect of this is that the individuals, when regarded as part of the whole 
parental population, have been mated assortatively. Assortative mating, however, 
has very little effect on the mid-parent regression, as we noted in the last chapter, 
and this feature of selection procedure can therefore be disregarded. 

Another point that should be mentioned concerns the linearity of the regression. 
It is drawn as a straight line in Fig. 11.1 and assumed to be linear in equation [11.1]. 
In most circumstances this assumption is justified and equation [11.1] is valid to 
a near approximation. Non-linearity could be produced by dominance. Considera¬ 
tion of Fig. 7.2 will show that if all the variance were due to a single locus with 
dominance, the regression of breeding value on genotypic value would be non-linear. 
However, when there are more than a few loci the distribution of genotypic values 
becomes nearly normal, as can be seen from Fig. 6.1, and the regression is effec¬ 
tively linear (Gimelfarb, 1986). For a discussion of non-linear regressions see also 
Robertson (1977b). 

Prediction of response 

The chief use of these equations of response is for predicting the response to selec¬ 
tion. Let us consider a little further the nature of the prediction that can be made. 
First, it is clear that equation [11.1] is not a prediction but simply a description, 
because the regression of offspring on parent cannot be measured until the offspring 
generation has been reared. The equation $R = h^2 S$, however, provides a means of 
prediction from knowledge of the heritability obtained from previous generations. 
The heritability for use in the prediction can be estimated by any method, such as 
a sib-correlation, and does not have to be estimated from the offspring-parent regres¬ 
sion. The selection differential S cannot be known till after the parents have been 
selected, but its expected value can be predicted, as will be explained in the next 
section. The following example illustrates the calculation of the selection differen¬ 
tial and response, and the prediction of the response by equation [11.2]. 

Example 11.1 The data in the table come from the experiment of Clayton, Morris, 
and Robertson (1957) on selection for abdominal bristle number in Drosophila 
melanogaster. The heritability of bristle number was first estimated in the base popula¬ 
tion before selection and found to be 0.52, as stated in Example 10.1. The parents selected 
for high bristle number had a mean superiority of S = 40.6 - 35.3 = 5.3 bristles. The 
predicted response, by equation [11.2], is 0.52 × 5.3 = 2.8. The observed response 
was 37.9 - 35.3 = 2.6 bristles. 


Generation   Mean of all   Mean of those   Selection      Response
             measured      selected        differential   Exp.    Obs.

Parents      35.3          40.6            5.3            2.8
Offspring    37.9          -               -                      2.6
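The arithmetic of Example 11.1 can be sketched in Python as follows. This is only an illustration of equation [11.2]; the function and variable names are assumptions of the sketch and are not taken from the original experiment.

```python
def selection_differential(mean_selected, mean_population):
    """S: mean of the selected parents as a deviation from the population mean."""
    return mean_selected - mean_population


def predicted_response(h2, s):
    """R = h^2 * S, equation [11.2]."""
    return h2 * s


# Values quoted in Example 11.1 (Clayton, Morris, and Robertson, 1957).
h2 = 0.52                      # heritability estimated in the base population
mean_parents_all = 35.3        # mean of all measured individuals, parent generation
mean_parents_selected = 40.6   # mean of those selected as parents
mean_offspring = 37.9          # mean of all measured offspring

S = selection_differential(mean_parents_selected, mean_parents_all)
R_expected = predicted_response(h2, S)
R_observed = mean_offspring - mean_parents_all

print(f"S = {S:.1f}, expected R = {R_expected:.1f}, observed R = {R_observed:.1f}")
```

Rounded to one decimal place, the three quantities agree with the values given in the example (5.3, 2.8 and 2.6 bristles).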


The prediction of response is valid, in principle, for only one generation of selec¬ 
tion. The response depends on the heritability of the character in the generation from 
which the parents are selected, so responses in later generations cannot, strictly speak¬ 
ing, be predicted without redetermining the heritability in each generation. There 
are two reasons why the heritability is expected to change. First, if there is a response 
the gene frequencies must change, and the heritability depends on the gene frequen¬ 
cies. This change is not likely to be apparent for some considerable time because 
gene frequency changes are small unless only a few loci are involved. Second, the 
selection of parents reduces the variance and the heritability. This takes place in 
the early generations. It will be explained briefly later and will be ignored mean¬ 
time. These expected changes in the heritability are not large, however, and 
experiments have shown that the response is usually maintained with little change 
over several generations — up to five, ten, or even more. This will be seen in the 
graphs of responses to selection given later in this chapter and in the next. 

Selection differential and intensity of selection 

The selection differential can be predicted in advance provided that two conditions 
hold: the phenotypic values of the character being selected are normally distributed, 
and selection is by truncation. Truncation selection means that individuals are chosen 
strictly in order of merit as judged by their phenotypic values, no individual being 
selected that is less good than any of those rejected. Under these conditions the selec¬ 
tion differential depends only on the proportion of the population included among 
the selected group, and the phenotypic standard deviation of the character. The 
dependence of the selection differential on these two factors is illustrated diagram- 
matically in Fig. 11.2. The graphs show the distribution of phenotypic values, which 
is assumed to be normal. The individuals with the highest values are supposed to 
be selected, so that the distribution is sharply divided at a point of truncation, all 
individuals above this value being selected and all below rejected. The arrow in each 
figure marks the mean value of the selected group, and S is the selection differen¬ 
tial. In graph (a) half the population is selected, and the selection differential is rather 
small: in graph (b) only 20 per cent of the population is selected, and the selection 
differential is much larger. In graph (c) 20 per cent is again selected, but the character 
represented is less variable and the selection differential is consequently smaller. 
The standard deviation in (c) is half as great as in (b) and the selection differential 
is also half as great. 

Fig. 11.2. Diagrams to show how the selection differential, S, depends on the proportion of the 
population selected, and on the variability, of a normally distributed character. All the 
individuals in the stippled areas, beyond the points of truncation, are selected. The axes are 
marked in hypothetical units of measurement. 

(a) 50 per cent selected; standard deviation 2 units: S = 1.6 units 

(b) 20 per cent selected; standard deviation 2 units: S = 2.8 units 

(c) 20 per cent selected; standard deviation 1 unit: S = 1.4 units 

The standard deviation, which measures the variability, is a property of the 
character and the population, and it sets the units in which the response is expressed, 
i.e., so many pounds, millimetres, bristles, etc. The response to selection may be 
generalized if the selection differential is expressed in terms of the phenotypic standard 
deviation, $\sigma_P$. This standardized selection differential, $S/\sigma_P$, is called the intensity 
of selection, symbolized by i. Then the selection differential is 

$S = i \sigma_P$ 

and the expected response in equation [11.2] becomes 

$R = i h^2 \sigma_P$ ...[11.3] 

By noting that $h = \sigma_A/\sigma_P$, where $\sigma_A$ is the standard deviation of breeding values 
(square-root of the additive genetic variance), we may write this equation in the form 

$R = i h \sigma_A$ ...[11.4] 

which is sometimes used in comparisons of different methods of selection. 

The intensity of selection, i, depends only on the proportion of the population 
included in the selected group and, provided the distribution of phenotypic values 
is normal, it can be determined from tables of the properties of the normal distribu¬ 
tion. If p is the proportion selected, i.e., the proportion of the population falling 
beyond the point of truncation, and z is the height of the ordinate at the point of 
truncation, then it follows from the mathematical properties of the normal distribu¬ 
tion that 


$\frac{S}{\sigma_P} = i = \frac{z}{p}$ ...[11.5] 

Thus, given only the proportion selected, p, we can find out by how many standard 
deviations the mean of the selected individuals will exceed the mean of the popula¬ 
tion: that is to say, the intensity of selection, i. The graphs in Fig. 11.3 show the 
relationship between i and p. Values of i for given values of p are tabulated in 
Appendix Table A. The relationship between i and p given in equation [11.5] ap¬ 
plies, strictly speaking, only to a large sample: that is to say, when a large number 
of individuals have been measured, among which the selection is to be made. When 
selection is made out of a small number of measured individuals, the mean devia¬ 
tion of the selected group is a little less. The intensity of selection can be found from 
tables of deviations of ranked data (Table XX of Fisher and Yates, 1963). The two 
lower curves in Fig. 11.3 show the intensity of selection when selection is made 
from samples of 20 and of 10 measured individuals. Appendix Table B gives some 
values of i when selection is made from small numbers. 
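As a rough illustration of equation [11.5], the large-sample intensity of selection can be computed from the standard normal distribution instead of being read from tables. The sketch below uses Python's standard-library `statistics.NormalDist`; the function names are assumptions of the sketch, and the small-sample correction described above is not attempted.

```python
from statistics import NormalDist


def intensity_of_selection(p):
    """i = z / p for truncation selection on a normal character (large sample, 0 < p < 1)."""
    std_normal = NormalDist()          # mean 0, standard deviation 1
    x = std_normal.inv_cdf(1.0 - p)    # point of truncation, in standard deviations
    z = std_normal.pdf(x)              # height of the ordinate at the point of truncation
    return z / p


def selection_differential(p, sd_phenotypic):
    """S = i * sigma_P."""
    return intensity_of_selection(p) * sd_phenotypic


# The three cases drawn in Fig. 11.2: (label, proportion selected, standard deviation).
for label, p, sd in [("a", 0.50, 2.0), ("b", 0.20, 2.0), ("c", 0.20, 1.0)]:
    i = intensity_of_selection(p)
    print(label, round(i, 2), round(selection_differential(p, sd), 1))
```

Rounded to one decimal place, the selection differentials for the three cases agree with those quoted in the caption of Fig. 11.2 (1.6, 2.8 and 1.4 units).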

Example 11.2 A comparison of the expected and observed responses under different 
intensities of selection was made by Clayton, Morris, and Robertson (1957), studying 
abdominal bristle number in Drosophila. The heritability was first determined by three 
methods which yielded a combined estimate of 0.52 (see Example 10.1). The standard 
deviation of bristle number (average of the two sexes) was 3.35. Selection at four dif¬ 
ferent intensities was carried on for five generations, both upward and downward (i.e., 
both for increased and for decreased bristle number). In each case 20 males and 20 females 
were selected as parents, the intensity being varied by the number out of which these 
were selected, as shown in the first column of the table. The intensities of selection cor¬ 
responding to these proportions selected are given in the second column of the table. 
Fig. 11.3. Intensity of selection in relation to proportion selected. The intensity of selection is the 
mean deviation of the selected individuals, in units of phenotypic standard deviations. The 
upper graph refers to selection out of a large total number of individuals measured: the lower 
two graphs refer to selection out of totals of 20 and 10 individuals respectively. A normal 
distribution is assumed. 


The expected responses are then found from equation [11.3]. Under the most intense 
selection, for example, it is R = 1.40 × 3.35 × 0.52 = 2.44. The observed responses 
are given in the last two columns of the table. Although they do not agree very precisely 
with expectation, they show how the change made by selection falls off as the intensity 
of selection is reduced. 




                                  Mean response per generation
Proportion       Intensity of                   Observed
selected, p      selection, i     Expected      Up        Down

20/100 = 0.20    1.40             2.44          2.62      1.48
20/75 = 0.267    1.23             2.14          2.20      1.26
20/50 = 0.40     0.97             1.65          1.46      0.79
20/25 = 0.80     0.34             0.59          0.28      -0.08

The selection differential, S in equation [11.2], and the intensity of selection, i 
in equation [11.3], refer to the mean superiority of all the parents used. Males and 
females may differ in the amount of selection that can be applied to them. Some 
characters, for example, can be measured only on one sex, so that no selection can 
be applied to the other sex. If the selection applied to males and females differs, 
the values of S or i to be used are the unweighted means for the two sexes, i.e., 

$S = \tfrac{1}{2}(S_m + S_f)$ ...[11.6a] 

or $i = \tfrac{1}{2}(i_m + i_f)$ ...[11.6b] 

Thus if only females, for example, can be selected, $S = \tfrac{1}{2} S_f$, and the expected 
response is $R = \tfrac{1}{2} h^2 S_f$. This can be related to equation [11.1] by noting that $\tfrac{1}{2} h^2$ is 
the regression of offspring on single parents. The sexes may also differ in the numbers 
used as parents. The value of S or i is then again as given in equations [11.6] because 
half the genes in the offspring come from each sex of the parents regardless of the 
numbers. 
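A minimal sketch of equation [11.6a] for the case in which only one sex is selected is given below. The numbers are hypothetical: the selection differential of Example 11.1 is simply supposed to apply to females alone, with males taken at random. The function name is an assumption of the sketch.

```python
def combined_selection_differential(s_males, s_females):
    """S = (S_m + S_f) / 2, equation [11.6a]: the unweighted mean of the two sexes."""
    return 0.5 * (s_males + s_females)


h2 = 0.52          # heritability quoted in Example 11.1
s_females = 5.3    # hypothetical: the Example 11.1 differential applied to females only
s_males = 0.0      # males not selected, so their selection differential is zero

S = combined_selection_differential(s_males, s_females)
R = h2 * S         # equation [11.2]; here this equals 0.5 * h2 * s_females

print(round(S, 2), round(R, 2))
```

With selection on one sex only, the expected response is half of what it would be if the same differential were applied to both sexes.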

Improvement of response 

The ways in which the breeder might improve the rate of response can be seen from 
the equation $R = i h^2 \sigma_P$. The phenotypic standard deviation, $\sigma_P$, merely specifies 
the units of measurement. The heritability can be increased by reducing the 
environmental variation through attention to the techniques of rearing and manage¬ 
ment, by multiple measurements when these are possible, and to a small extent by 
assortative matings as explained in the last chapter. Increasing the intensity of selection 
seems at first sight to be a straightforward way of improving the response, but there 
are two factors that limit what the breeder can do in this way. First, the reproduc¬ 
tive rate of the organism limits the intensity of selection because the proportion 
selected for breeding can never be less than the proportion needed for replacement. 
That is to say, two individuals are needed on average to replace each pair of parents. 
So the more prolific the organism the more intense the selection that can be applied. 
If males mate with more than one female, males have more offspring than females, 
and selection can be more intense on males than on females. Suppose, for example, 
that each male mates with 10 females, and the females have on average 5 daughters 
each. To allow for replacement of the females the proportion of females selected 
cannot be less than 1/5, but males have on average 50 sons, allowing selection of 
1/50 to replace the males. The upper limits of the intensity of selection in this case 
would be $i_f = 1.40$ for females and $i_m = 2.42$ for males, giving a net intensity of 
i = 1.91 by equation [11.6b]. The second factor that limits the intensity of selection 
is the consequence of population size and inbreeding. Inbreeding almost always 
reduces reproductive fitness and characters related to it, as will be explained in 
Chapter 14. So the number of parents used must be large enough to keep this 
inbreeding depression to an acceptable level when balanced against the gain from 
selection. In experimental work, for example, one might decide not to use fewer than 
10 or 20 pairs of parents. When the number of parents to be used is fixed in this 
way, the intensity of selection can be increased only by measuring more individuals 
out of which to select the parents. 
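The upper limits of intensity in the example above (each male mated to 10 females, 5 daughters per female, 50 sons per male) can be checked with a sketch that combines equations [11.5] and [11.6b]. A normal distribution and a large sample are assumed, as in the text; the function and variable names are assumptions of the sketch.

```python
from statistics import NormalDist


def intensity_of_selection(p):
    """Large-sample intensity i = z / p for truncation selection (equation [11.5])."""
    x = NormalDist().inv_cdf(1.0 - p)   # point of truncation
    return NormalDist().pdf(x) / p


females_per_male = 10    # each male mates with 10 females
daughters_per_female = 5
sons_per_female = 5      # implied by the text's figure of 50 sons per male

# Smallest proportions selected that still replace the parents.
p_females = 1 / daughters_per_female                 # 1/5
p_males = 1 / (females_per_male * sons_per_female)   # 1/50

i_f = intensity_of_selection(p_females)
i_m = intensity_of_selection(p_males)
i_net = 0.5 * (i_m + i_f)                            # equation [11.6b]

print(round(i_f, 2), round(i_m, 2), round(i_net, 2))
```

To two decimal places these agree with the limits quoted in the text: 1.40 for females, 2.42 for males, and a net intensity of 1.91.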

Generation interval. The number of offspring available for selection depends not 
only on the parents’ reproductive rate but, in many organisms, also on how long 
the breeder is willing to wait before he makes the selection. The progress per unit 
of time is usually more important than progress per generation which has been dealt 
with so far. The interval of time between generations is therefore an important fac¬ 
tor in reckoning the response to selection. By waiting until more offspring have been 
reared before he makes the selection, the breeder can increase the intensity of selec¬ 
tion and the response per generation, but in doing so he inevitably increases the 
generation interval and may thereby reduce the response per unit of time. There 
is thus a conflict of interest between intensity of selection and generation interval, 
and the best compromise must be found between the two. Increasing the number 
of offspring will pay up to a certain point, and beyond this point it will not. The 
optimal number of offspring cannot be stated in general terms, and each case must 
be worked out according to its special circumstances. 

In reckoning the generation interval under any scheme of selection, distinction 
must be made between discrete and overlapping generations. When generations are 
discrete the offspring are kept till the last-born are mature; selection is then made 
and the selected individuals are all mated at more or less the same time. The genera¬ 
tion interval is the interval between the matings made in successive generations. 
When generations overlap, replacement of the parents by selected offspring is a more 
or less continuous process, the selected offspring being mated as soon as they are 
mature. The generation interval can be calculated as the average age of the parents 
at the birth of their selected offspring. The problem is to find the optimal age for 
discarding the parents. Example 11.3 below illustrates the calculation of the optimal 
procedure both when generations are discrete and when they are overlapping. When 
fewer male parents are used than female, the generation intervals of males and 
females, $L_m$ and $L_f$, must be distinguished as well as the intensities of selection, $i_m$ 
and $i_f$. What has to be maximized is the ratio $(i_m + i_f)/(L_m + L_f)$. The solution, 
which can be found graphically, is explained by Ollivier (1974), where the solutions 
for the main farm animals are given. Sometimes it is necessary to distinguish 
four intensities of selection and generation intervals, according to whether the male 
and female parents are used to breed sons or daughters. The overall ratio i/L is then 
calculated as $\Sigma i/\Sigma L$, where $\Sigma i$ is the sum of the four intensities and $\Sigma L$ is the sum 
of the four generation intervals (Rendel and Robertson, 1950). 

Example 11.3 Let us suppose that selection is to be applied to some character in mice, 
and that speed of progress per unit of time is the aim. The question is: how many litters 
should be raised? To find the number of litters that will give the maximum speed of 
progress, we have to find the intensity of selection and the generation interval. The ratio 
of the two will then give the relative speed. The actual speed could be obtained by multiply¬ 
ing by the heritability and the standard deviation, but these factors will be assumed to 
be independent of the number of litters raised. A comparison of the expected rates of 
progress per week is made in the table. The comparison is made for two different average 
sizes of litter, meaning the number of young reared per litter. It is assumed that the 
character to be selected can be measured before sexual maturity, and that first litters 
are born when the parents are 9 weeks old, subsequent litters following at intervals of 
4 weeks. It is assumed also that the population is large enough to be treated as a large 
sample in reckoning the intensity of selection, and that equal numbers of males and females 
are selected. 

The generation interval depends on the procedure for selection and mating. Two dif¬ 
ferent procedures are considered. First, selection is deferred till all the litters have been 
measured. All the selected mice are then mated at the same time, and generations are 
discrete. The generation interval, tabulated under L, is the age of the parents at the birth 
of the last litter to be raised. This is a realistic procedure for laboratory experiments. 
Second, the mice required are selected equally from all the litters raised. For example, 
if two litters are raised, one per litter is selected from first litters and one per litter from 
second litters, making a total of two per family. The intensity of selection is the same 
as by the first procedure provided the total numbers are large. The selected mice are 
mated as soon as they are mature, and generations are therefore overlapping. The generation 
interval, tabulated under $\bar{L}$, is the parents’ mean age at the birth of their litters. 
This procedure is more realistic for a practical breeding programme. 

The optimal number of litters is shown by the maximal value of the ratio i/L or i/$\bar{L}$. 
It differs according to the litter size and the procedure. With the first procedure, if 6 
young are raised per litter the maximum rate of response is attained by rearing only one 
litter; if 4 young are reared it is worth while to wait for second litters but not for third 
litters. With the second procedure, if 6 young are reared the parents should be discarded 
after their second litters; if 4 young are reared, the rate of progress is the same when 
parents are discarded after their second or their third litters. These conclusions could 
hardly have been guessed at without the computations shown in the table. 


                          n = 6                                  n = 4
N    L    $\bar{L}$   p       i      i/L     i/$\bar{L}$    p       i      i/L     i/$\bar{L}$

1    9    9           0.333   1.10   0.122   0.122          0.500   0.80   0.089   0.089
2    13   11          0.167   1.50   0.115   0.136          0.250   1.27   0.098   0.115
3    17   13          0.111   1.71   0.101   0.132          0.167   1.50   0.088   0.115
4    21   15          0.083   1.85   0.088   0.123          0.125   1.65   0.079   0.110

n = number of young reared per litter 
N = number of litters raised 
L = generation interval, in weeks, to last litter 
$\bar{L}$ = mean generation interval, all litters (see text) 
p = proportion selected 
i = intensity of selection 
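The computations behind the table of Example 11.3 can be sketched as follows, under the assumptions stated in the example: first litters at 9 weeks, later litters every 4 weeks, two replacements needed per pair of parents, and a large sample. The intensity is computed from the normal distribution, not read from tables, so the results may differ slightly in the last decimal place from the tabulated values. All names are assumptions of the sketch.

```python
from statistics import NormalDist


def intensity_of_selection(p):
    """Large-sample intensity i = z / p (equation [11.5])."""
    x = NormalDist().inv_cdf(1.0 - p)
    return NormalDist().pdf(x) / p


FIRST_LITTER_AGE = 9   # weeks: parents' age at the birth of their first litter
LITTER_INTERVAL = 4    # weeks between successive litters


def rates(n_young, n_litters):
    """Return (i/L, i/Lbar) for n_young reared per litter and n_litters raised."""
    p = 2 / (n_young * n_litters)    # two offspring replace each pair of parents
    i = intensity_of_selection(p)
    ages = [FIRST_LITTER_AGE + LITTER_INTERVAL * k for k in range(n_litters)]
    l_last = ages[-1]                # discrete generations: wait for the last litter
    l_mean = sum(ages) / len(ages)   # overlapping generations: mean age over all litters
    return i / l_last, i / l_mean


for n_young in (6, 4):
    for n_litters in (1, 2, 3, 4):
        discrete, overlapping = rates(n_young, n_litters)
        print(n_young, n_litters, round(discrete, 3), round(overlapping, 3))
```

The number of litters giving the largest ratio in each column is the optimum for that litter size and procedure, as described in the example.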


Measurement of response 

When one or more generations of selection have been made, the measurement of 
the response actually obtained introduces several problems. These are matters of 
procedure rather than of principle and will be only briefly discussed. 


Variability of generation means 

The first problem arises from the variability of generation means. Inspection of any 
graph of selection shows that the generation means do not progress in a simple regular 
fashion, but fluctuate erratically and more or less violently. The consequence of 
this variation between generation means is that the response can seldom be measured 
with any pretence of accuracy until several generations of selection have been made. 
The best measure of the average response per generation is then obtained from the 
slope of a regression line fitted to the generation means, as illustrated in the follow¬ 
ing example, the assumption being made that the true response is constant over the 
period. 
Example 11.4 Figure 11.4 shows the results of 11 generations of two-way selection 
for body weight in mice (Falconer, 1953). On the left the ‘up’ and ‘down’ lines are shown 
separately, and on the right the divergence between the two is shown. Linear regression 
lines are fitted to the observed generation means. (The first generation of selection is 
disregarded because the method of selection was different.) The estimates of the average 
response per generation, with their standard errors, are as follows: 

Up          0.27 ± 0.050 
Down        0.62 ± 0.046 
Divergence  0.88 ± 0.036 

The difference between the upward and downward responses will be discussed in the 
next chapter. 
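The estimation of the average response per generation as a regression slope, as in Example 11.4, can be sketched with an ordinary least-squares fit. The generation means below are hypothetical and are not the data of Fig. 11.4, which are not given in the text. The function name is an assumption of the sketch, and the standard error is the usual one for a regression slope with independent, equally variable residuals.

```python
import math


def slope_with_standard_error(generations, means):
    """Least-squares slope of generation means on generation number, with its standard error."""
    n = len(generations)
    mean_x = sum(generations) / n
    mean_y = sum(means) / n
    sxx = sum((x - mean_x) ** 2 for x in generations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(generations, means))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(generations, means))
    standard_error = math.sqrt(residual_ss / (n - 2) / sxx)   # requires n > 2
    return slope, standard_error


# Hypothetical generation means, for illustration only.
generations = [1, 2, 3, 4, 5, 6]
means = [22.1, 22.6, 22.5, 23.2, 23.1, 23.7]

slope, se = slope_with_standard_error(generations, means)
print(f"response per generation = {slope:.2f} ± {se:.3f}")
```

The same function applied to the differences between an upward and a downward line would give the rate of divergence.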

The causes of variation of the generation means will be considered more fully 
in the next chapter. Here we simply note what the causes are, and consider what 
can be done to reduce this variation in the response from generation to generation. 
The causes of the variation are: random genetic drift, sampling errors in estimating 
the generation means, differences in the selection differential, and environmental 
factors. Variation due to random drift and sampling errors can be reduced only by 
increasing the numbers selected and measured. Differences in the selection differential 
can be allowed for in a way to be explained in the next sections. Environmental 
differences between generations can arise from many causes, climatic, nutritional, 
and general management. The obvious way of eliminating environmental fluctua¬ 
tions from an assessment of the rate of response is by keeping an unselected control 
population. On the assumption that environmental differences affect the selected and 
control populations alike, the difference between them estimates the genetic improve¬ 
ment made by selection. The use of a control, however, does not always improve 
the precision with which the response is estimated. Both populations are subject to 
random drift and to sampling errors, and the sampling variance of the difference 
between the two lines is the sum of the sampling variances of each line separately. 
Fig. 11.4. Two-way selection for 6-week weight in mice. On the left the responses of the two lines 
are shown separately. On the right the ‘divergence’ is shown, i.e., the difference between the 
upward- and downward-selected lines. See Example 11.4. (Based on Falconer, 1953.) 
Furthermore, the scale of an experiment is usually limited by the facilities available, 
so that the use of a control necessitates a reduced population size of the selected 
line. If the selected line and the control both have half the population size of a single 
selected line, then the use of a control quadruples the sampling variance of the 
response measured as deviations from control, and so doubles the standard error. 
This loss of accuracy may counterbalance the gain from eliminating environmental 
differences. The relative accuracy of the response measured by use of a control can 
be improved if the ‘control’ is not an unselected population but is selected in the 
opposite direction. This is known as ‘two-way’, or ‘divergent’, selection and is 
illustrated in Fig. 11.4. Each selected line acts as a ‘control’ for the other and the 
response is measured as the divergence between the two lines. The elimination of 
some of the variation from generation to generation by this means is clearly seen 
in Fig. 11.4. In the absence of environmental differences between generations, the 
precision is the same, relative to the magnitude of the response, as that of a single 
line occupying the same total facilities. The reason for this is that the standard error 
of the difference between the lines is doubled, as explained above, but the response 
is also approximately doubled because both lines are selected. An unselected con¬ 
trol, however, is preferable to two-way selection if, for practical reasons, one is 
interested only in the change in one direction, because the response is not always 
equal in the two directions, a point that will be discussed in the next chapter. 
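The argument about precision can be followed with a few lines of arithmetic. The sketch assumes, as the text implies, that the sampling variance of a line mean is inversely proportional to the size of the line, and it ignores environmental differences between generations. The units are arbitrary and the names are assumptions of the sketch.

```python
import math


def sampling_variance(base_variance, relative_size):
    """Sampling variance of a line mean, assumed inversely proportional to line size."""
    return base_variance / relative_size


base = 1.0   # sampling variance of a single selected line using all the facilities

# Selected line plus unselected control, each at half the size.
var_half = sampling_variance(base, 0.5)       # twice the base variance for each line
var_difference = var_half + var_half          # variances add for a difference
se_ratio = math.sqrt(var_difference / base)   # standard error relative to the single line

# Two-way selection: the same standard error of the difference, but roughly twice
# the response (assuming equal responses in the two directions, which need not hold).
relative_precision_two_way = 2.0 / se_ratio   # response / standard error, vs. single line

print(var_difference / base, se_ratio, relative_precision_two_way)
```

The sampling variance is quadrupled and the standard error doubled by the use of a control, while two-way selection restores the precision relative to the response to that of a single line.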

Random changes of environment reduce the precision with which the response 
is estimated, but they do not bias the estimate. A more serious difficulty arises from 
environmental trends, i.e., progressive changes with time, because what looks like 
a response to selection may really be due to an environmental trend. This makes 
it difficult to assess the effectiveness of selection in the improvement of domesticated 
animals and plants, because without a control there is no sure way of deciding how 
much of the improvement is due to selection and how much to improved manage¬ 
ment. However, when generations overlap, as with farm animals, it is possible to 
assess the genetic improvement without an unselected control by making comparisons 
between contemporary individuals belonging to different generations (see Smith, 
1962). For detailed consideration of the measurement of responses and the use of 
controls, see Hill (1972a) and Muir (1986). Experimental evidence about the 
usefulness of controls is reviewed by Hill (1972b). 

Weighting the selection differential 

In experimental selection the selection differential as well as the response has to 
be measured because it is the relationship between the two, and not the response 
alone, that is of interest from the genetic point of view. We have to distinguish be¬ 
tween the expected and the effective selection differential, because in practice the 
individual parents do not contribute equally to the offspring generation. Differences 
of fertility are always present, so that some parents contribute more offspring than 
others. To obtain a measure of the selection differential that is relevant to the response 
observed in the mean of the offspring generations, we therefore have to weight the 
deviations of the parents according to the number of their offspring that are measured. 
The expected selection differential is the simple mean phenotypic deviation of the 
parents as defined at the beginning of this chapter; the effective selection differential
is the weighted mean deviation of the parents, the weight given to each parent,
or pair of parents, being their proportionate contribution to the individuals that are
measured in the next generation.
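
The two definitions can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular experiment: the function name and the data are hypothetical, and deviations are taken as parent value minus population mean, so selection for low values gives negative differentials.

```python
def selection_differentials(parent_values, offspring_counts, population_mean):
    """Return (expected, effective) selection differentials.

    expected:  simple mean deviation of the selected parents
    effective: mean deviation weighted by each parent's number of
               measured offspring
    """
    deviations = [value - population_mean for value in parent_values]
    expected = sum(deviations) / len(deviations)
    total_offspring = sum(offspring_counts)
    effective = sum(d * n for d, n in zip(deviations, offspring_counts)) / total_offspring
    return expected, effective


# Hypothetical data: five parents selected for low values from a
# population with mean 20.0; the most extreme parents leave fewest offspring.
parents = [15.0, 16.0, 17.0, 18.0, 19.0]
offspring = [2, 4, 6, 8, 10]
expected, effective = selection_differentials(parents, offspring, 20.0)
print(f"expected S = {expected:.2f}, effective S = {effective:.2f}")
```

In this made-up case the effective differential is smaller in magnitude than the expected one, which is the pattern that would suggest natural selection opposing the artificial selection.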

The weighting of the selection differential takes account of a good part of the effects 
of natural selection. If the differences of fertility are related to the parents’ phenotypic 
values for the character being selected, then this natural selection will either help 
or hinder the artificial selection. If, for example, the more extreme phenotypes are 
less fertile, or more frequently sterile, then natural selection is working against 
artificial selection. By weighting the selection differential we measure the joint effects
of natural and artificial selection together. A comparison of the effective (i.e.
weighted) with the expected selection differential may thus be used to discover 
whether natural selection is operative. 

Example 11.5 The experiment with mice, shown in Fig. 11.4, was carried through 
30 generations in the upward direction and 24 generations in the downward direction 
(see Falconer, 1955). Comparisons are made in the table between the effective (weighted) 
and the expected (unweighted) selection differentials in the two lines. The period of selection
is divided into two parts and the comparisons are made separately in each. Throughout
the whole of the upward selection there was virtually no difference between the effective
and expected selection differentials, and we can conclude that natural selection was
unimportant as a factor influencing the response. The situation in the downward selected 
line, however, is different, the effective selection differential being less than the expected, 
especially in the second part. From this we can conclude that natural selection was 
operating in favour of large size, thus hindering the artificial selection and reducing the 
response obtained, particularly in the latter part of the experiment. The main way in 
which natural selection acted in the small line was through fertility. The smaller mice 
tended to have fewer young in their litters and to be more often sterile, with the result 
that the smallest of the selected parents were represented by fewer measured offspring. 
There was also another way in which the effective selection differential was reduced 
which was formally equivalent to natural selection. Smaller mice tend to be more reactive
and ‘jumpy’ than larger mice, and very small mice often escape and are lost during
the changing of cages. Those lost in this way before they reproduced proved to be the 
smallest of the selected mice. 


Direction of    Generation    Selection differential       Effective/
selection       numbers       per generation (g)           Expected
                              Expected     Effective

Upwards         1-22          1.39         1.36            0.98
                23-30         1.08         1.09            1.01
Downwards       1-18          1.03         0.96            0.93
                19-24         0.82         0.70            0.86


The weighting of the selection differential does not take account of the whole effect 
of natural selection, because it makes no allowance for any differences of viability 
among the offspring that may be related to their phenotypic values. 


Realized heritability 

The response per generation, such as is illustrated in Fig. 11.4, describes what happened,
but it takes no account of the amount of selection applied. A means is therefore
needed of showing how the response is related to the selection differential. This
is done by expressing the response as a proportion of the selection differential, i.e.,
the ratio R/S, in the following way. The generation means are plotted against the
cumulated selection differential. That is to say, the selection differentials, appropriately
weighted, are summed over successive generations so as to give the total
selection applied up to the generation in question. Responses plotted in this way
are illustrated in Fig. 11.5. The average value of the ratio R/S is then given by the
slope of the regression line fitted to the points.
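
As a rough sketch of this procedure, the following Python fragment cumulates the selection differentials and fits an ordinary least-squares slope to the generation means. The function name and the data are hypothetical, and the differentials are assumed to be already weighted.

```python
def realized_heritability(generation_means, selection_differentials):
    """Slope of generation means regressed on cumulated selection differential.

    generation_means:        means of generations 0, 1, ..., n
    selection_differentials: differential applied in generations 0, ..., n - 1
    """
    # Cumulate the differentials: generation 0 has had no selection applied.
    cumulated = [0.0]
    for s in selection_differentials:
        cumulated.append(cumulated[-1] + s)
    if len(cumulated) != len(generation_means):
        raise ValueError("need one more generation mean than selection differentials")

    n = len(generation_means)
    mean_x = sum(cumulated) / n
    mean_y = sum(generation_means) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cumulated, generation_means))
    sxx = sum((x - mean_x) ** 2 for x in cumulated)
    return sxy / sxx


# Hypothetical upward-selected line: each response is one quarter of the
# differential applied, so the fitted slope should be 0.25.
means = [20.0, 20.5, 20.75, 21.5, 22.0]
differentials = [2.0, 1.0, 3.0, 2.0]
h2_realized = realized_heritability(means, differentials)
print(f"realized heritability = {h2_realized:.3f}")
```

The standard error of a slope fitted in this way does not include the variation due to random drift, so it should not be read as the standard error of the realized heritability.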

The response to selection can be used as a means of estimating the heritability in 
the base population, because the expected value of the ratio R/S is the heritability, 
rearrangement of equation [11.2] giving

$$h^2 = \frac{R}{S} \qquad \ldots [11.7]$$

The heritability estimated in this way is called the realized heritability because it 
is primarily a description of the response and may, for several reasons, not provide 
a valid estimate of the heritability in the base population. First, for reasons to be 
explained at the end of this chapter, the responses of characters with high heritabilities 
are expected to be somewhat reduced after the first generation of selection, so that 
the realized heritability after the first generation will underestimate the heritability 
in the base population. Second, if there are systematic changes due to environmental 
trends or inbreeding depression, these will be included in the response unless they 
are removed by comparison with a control line. Changes due to random drift are 
also confounded with the response. The effects of random drift, which will be discussed
in the next chapter, can be assessed only by replication of the selection. Figure
11.5 provides a clear example of realized heritabilities being disturbed by other factors.
Selection in the two directions yielded very different realized heritabilities;



Fig. 11.5. Two-way selection for 6-week weight in mice. The generation means are plotted 
against the cumulated selection differentials, as explained in the text. The slopes of the 
regression lines fitted to the points measure the realized heritabilities, which were 0.175 for 
upward selection and 0.518 for downward selection. (After Falconer, 1954.)

each is a valid description of the response, but they cannot both be valid estimates
of the heritability in the base population. The reasons for responses being different
in the two directions will also be discussed in the next chapter.

Change of gene frequency under artificial selection 

It was pointed out at the beginning of this chapter that the change of the population
mean resulting from selection is brought about through changes of the gene frequencies
at the loci that influence the character selected. But since the effects of the loci
cannot be individually identified, the changes of gene frequency cannot be followed
in practice. It is possible, however, to deduce an approximate expression connecting
the intensity of selection, $i$, with the coefficient of selection, $s$, acting on individual
loci. The approximate change of gene frequency can then be found by substituting
the appropriate value of $s$ in the formulae given in Chapter 2.

The effect of selection for a metric character on one of the loci concerned may
be pictured in the manner illustrated in Fig. 11.6. This refers to a locus with two
alleles and shows only the two homozygous genotypes, $A_1A_1$ and $A_2A_2$. The position
of the heterozygous genotype will be considered later. The two genotypes are
shown as having equal frequencies, but this is not necessary to the argument. The
homozygous genotypes differ in their mean phenotypic values by $2a$ units of measurement
in the notation of earlier chapters (see Fig. 7.1). The distribution of phenotypic
values within each genotype is depicted, this variation arising from other loci as
well as from environmental causes. It is assumed that both distributions are normal
and that the variance within each genotype is the same. The solid vertical line marks
the point of truncation, to the right of which all individuals of both genotypes are
selected.

Fig. 11.6. Selection for a metric character operating on one of the loci concerned.

The coefficient of selection is deduced by finding the relative fitness of the two
genotypes, i.e., the relative proportions that survive through being selected. Let $p$
be the proportion of $A_1A_1$ that survive, shown by stippling in the figure. Now
imagine the truncation point is moved down by $2a$ units, as shown by the broken
line, so that the same proportion $p$ is cut off from the $A_2A_2$ genotype by this new
truncation. The proportion of the $A_2A_2$ genotype that is actually selected is $p$ minus
the proportion lying between the two truncation lines. Provided the separation between
the two genotypes, i.e., $2a$, is small relative to the standard deviation, the
area of the $A_2A_2$ curve between the two truncation lines is $2az$, where $z$ is the height
of the ordinate. The proportion of the $A_2A_2$ genotype corresponding to this area
is $2az/\sigma_P$, where $\sigma_P$ is the phenotypic standard deviation. From equation [11.5],
$z = ip$, and so the proportion of $A_2A_2$ actually selected is $p - 2aip/\sigma_P =
p(1 - 2ai/\sigma_P)$. In Chapter 2 the coefficient of selection referred to the reduced
fitness of the genotype selected against, which is here $A_2A_2$. The coefficient of
selection against $A_2A_2$ is obtained from its relative fitness as

$$1 - s = \frac{\text{fitness of } A_2A_2}{\text{fitness of } A_1A_1} = \frac{p(1 - 2ai/\sigma_P)}{p}$$

from which the coefficient of selection against $A_2A_2$ becomes

$$s = i\,\frac{2a}{\sigma_P} \quad \text{(approx.)} \qquad \ldots [11.8]$$


This relationship is valid for any degree of dominance, though only two genotypes
have been considered in its derivation. If $A_1$ is completely dominant, the
heterozygotes are included with the $A_1A_1$ genotype, and if $A_1$ is completely
recessive they are included with $A_2A_2$. In either case the change of gene frequency
is given approximately by equation [2.8], which is $\Delta q = -sq^2(1 - q)$, where $q$ is
the frequency of $A_2$. If there is no dominance, the coefficient of selection acting
on the heterozygote is defined in Chapter 2 as $\tfrac{1}{2}s$. The difference in mean between
$A_1A_2$ and $A_1A_1$ is $a$, and the relationship equivalent to equation [11.8] is
$\tfrac{1}{2}s = ia/\sigma_P$, which is the same as equation [11.8]. The change of gene frequency can
therefore be obtained by substituting $s$ from equation [11.8] into equation [2.7], which
is $\Delta q = -\tfrac{1}{2}sq(1 - q)$. The conditions under which equation [11.8] provides a
reasonable approximation for $s$ are examined by Latter (1965), and a more general
derivation is given by Kimura and Crow (1978).
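
The calculation can be sketched numerically as follows. The function names and the input values are hypothetical; the intensity of selection is obtained from the proportion selected as i = z/p for truncation of a normal distribution in a large population, and the approximate forms of equations [2.7] and [2.8] quoted above are used.

```python
from statistics import NormalDist


def selection_intensity(proportion_selected):
    """Intensity i = z / p for truncation selection on a normal distribution."""
    standard_normal = NormalDist()
    x = standard_normal.inv_cdf(1.0 - proportion_selected)  # truncation point
    z = standard_normal.pdf(x)                               # ordinate at that point
    return z / proportion_selected


def coefficient_of_selection(i, a, sigma_p):
    """Equation [11.8]: s = i * 2a / sigma_P (approximate)."""
    return i * 2.0 * a / sigma_p


def delta_q_no_dominance(s, q):
    """Equation [2.7], approximate form: change in frequency of A2."""
    return -0.5 * s * q * (1.0 - q)


def delta_q_complete_dominance(s, q):
    """Equation [2.8], approximate form: change in frequency of A2."""
    return -s * q ** 2 * (1.0 - q)


# Hypothetical locus: a = 0.1 units, sigma_P = 1.0, q = 0.5, with 20% selected.
i = selection_intensity(0.20)
s = coefficient_of_selection(i, a=0.1, sigma_p=1.0)
dq = delta_q_no_dominance(s, q=0.5)
print(f"i = {i:.3f}, s = {s:.3f}, delta q = {dq:.4f}")
```

Because [11.8] is an approximation that holds when 2a is small relative to the standard deviation, a sketch like this should only be trusted for loci of small standardized effect.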

Equation [11.8] shows how the two ways of expressing the ‘strength’ of selection,
by the intensity and by the coefficient of selection, are related to each other.
The change of mean of a quantitative character can be derived in the two corresponding
ways, and the equivalence of the two derivations provides a check on equation
[11.8]. The change of mean can be derived from $s$ and $\Delta q$ as follows. Consider
a single locus with no dominance. From equation [7.2], the mean with $d = 0$ is
$M = a(p - q) = a(1 - 2q)$, where $q$ is the initial gene frequency. The change
of mean is $\Delta M = a(1 - 2q_1) - a(1 - 2q)$, where $q_1$ is the new gene frequency.
This reduces to $\Delta M = -2a\Delta q$. Substituting for $\Delta q$ from equation [2.7] gives
$\Delta M = asq(1 - q)$. Equation [11.8] allows us to put $s$ into terms of $i$ and this gives
$\Delta M = 2ia^2q(1 - q)/\sigma_P$. Now, when $d = 0$, $2a^2q(1 - q) = V_A$, by equation [8.5],
$V_A$ being the additive variance due to the locus under consideration. Summing over
all loci gives the change of mean in terms of the total additive variance as
$\Delta M = iV_A/\sigma_P = (iV_A/V_P)\sigma_P = ih^2\sigma_P$, and this is the response to selection given by equation
[11.3]. Thus equation [11.8] provides a connection between the change of mean
derived from $s$ and $\Delta q$, and the response derived from $i$ and $h^2$. The same
equivalence can be demonstrated for a dominant gene if the approximation of neglecting
$(\Delta q)^2$ is made.
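
The equivalence can be checked numerically with a short sketch. The loci, the environmental variance and the function name are hypothetical, and the phenotypic variance is assumed to consist only of additive and environmental variance (no dominance or interaction).

```python
import math


def change_of_mean_two_ways(loci, i, v_e):
    """Compare the change of mean summed over loci with i * h^2 * sigma_P.

    loci: list of (a, q) pairs for additive loci (d = 0), where q is the
          frequency of the allele selected against.
    Assumes V_P = V_A + V_E, with V_A = sum of 2 a^2 q (1 - q) over loci.
    """
    v_a = sum(2.0 * a * a * q * (1.0 - q) for a, q in loci)
    v_p = v_a + v_e
    sigma_p = math.sqrt(v_p)

    # Route 1: per-locus s from [11.8], delta q from [2.7], delta M = -2 a delta q.
    delta_m_from_loci = 0.0
    for a, q in loci:
        s = 2.0 * i * a / sigma_p
        delta_q = -0.5 * s * q * (1.0 - q)
        delta_m_from_loci += -2.0 * a * delta_q

    # Route 2: response from [11.3], R = i h^2 sigma_P.
    h2 = v_a / v_p
    response = i * h2 * sigma_p
    return delta_m_from_loci, response


# Hypothetical set of four loci with different effects and frequencies.
example_loci = [(0.10, 0.5), (0.20, 0.3), (0.05, 0.8), (0.15, 0.6)]
from_loci, from_h2 = change_of_mean_two_ways(example_loci, i=1.4, v_e=1.0)
print(f"sum over loci = {from_loci:.6f}, i*h2*sigma_P = {from_h2:.6f}")
```

The two routes agree exactly here because both rest on the same approximations; the check confirms the algebra, not the accuracy of equation [11.8] itself.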

The quantity $2a/\sigma_P$ in equation [11.8] is the difference of value between the two
homozygotes, expressed in terms of the phenotypic standard deviation. This quantity 
will be referred to as the standardized effect of the locus. Equation [11.8] will be 
used in the next chapter to draw some tentative conclusions about the standardized 
effects of loci giving rise to selection responses. 


Effects of selection on variance 

It was stated earlier that selection of parents reduces the variance. The effects of 
this reduced variance on the expected response to selection will now be briefly 
described; a full explanation is given by Bulmer (1985, Ch. 9). The theory to be 
outlined ignores changes of variance resulting from changes of gene frequency. It 
is therefore strictly applicable only when the number of genes is large, and their 
individual effects on the character are consequently small, so that changes of gene 
frequency are negligible. This means in practice that it is not applicable over a long 
period of selection. 

A group of selected parents represents one tail of the phenotypic distribution, and
in consequence their phenotypic variance must be less than that of the whole population
from which they are selected. If $V_P$ is the phenotypic variance before selection
and $k$ the factor by which it is reduced, then the phenotypic variance, $V_P^*$, in
the selected parents is

$$V_P^* = (1 - k)V_P$$

The factor $k$ depends on the intensity of selection. When selection is by truncation
of a normal distribution, then

$$k = i(i - x) \qquad \ldots [11.9]$$

where $i$ is the intensity of selection and $x$ is the corresponding deviation of the point
of truncation from the population mean. (Values of $i$ and $x$ are given in Appendix
Table A.) The additive variance among the selected individuals can be deduced as
follows. The correlation between breeding values, $A$, and phenotypic values, $P$, is
$h$ (equation [10.3]). The square of the correlation is the proportion of the variance
of $A$ that is associated with variation of $P$. Thus only the proportion $h^2$ of the
additive variance is affected by the reduced phenotypic variance in the selected
individuals. $V_P$ is reduced by the factor $k$ and the proportion $h^2$ of $V_A$ is reduced
by the same factor. Thus the additive variance among the selected individuals, $V_A^*$, is

$$V_A^* = (1 - h^2k)V_A$$

where $V_A$ is the additive variance in the population before selection.
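
A minimal sketch of these calculations is given below. Instead of looking up i and x in a table, it derives them from the proportion selected using the standard normal distribution, which assumes a large population; the function names and the variances used are hypothetical.

```python
from statistics import NormalDist


def truncation_parameters(proportion_selected):
    """Return (i, x, k) for truncation selection on a normal distribution.

    x is the truncation point in standard deviations from the mean,
    i = z / p is the intensity of selection, and k = i (i - x) as in [11.9].
    """
    standard_normal = NormalDist()
    x = standard_normal.inv_cdf(1.0 - proportion_selected)
    i = standard_normal.pdf(x) / proportion_selected
    k = i * (i - x)
    return i, x, k


def variances_among_selected_parents(v_p, v_a, proportion_selected):
    """Phenotypic and additive variance among the selected parents."""
    _, _, k = truncation_parameters(proportion_selected)
    h2 = v_a / v_p
    return (1.0 - k) * v_p, (1.0 - h2 * k) * v_a


# Hypothetical population: V_P = 100, V_A = 50 (h^2 = 0.5), best 20% selected.
i, x, k = truncation_parameters(0.20)
v_p_selected, v_a_selected = variances_among_selected_parents(100.0, 50.0, 0.20)
print(f"i = {i:.3f}, x = {x:.3f}, k = {k:.3f}")
print(f"selected parents: V_P = {v_p_selected:.1f}, V_A = {v_a_selected:.1f}")
```

With half the population selected, x is zero and k reduces to the square of i, which gives a quick sanity check on the function.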

The reduction of the additive variance can be described in terms of gametic phase 
disequilibrium. This was mentioned in Chapter 8 as the consequence of one form 
of non-random mating, and is represented by the covariance term in equation [8.10]. 
The covariance due to disequilibrium here is $-h^2kV_A$; being negative, it reduces the
variance. The reason why negative disequilibrium is generated can be understood
in the following way. Suppose we could choose a set of individuals all with the same
genotypic value, and suppose we could record the magnitude of the effect on the 
character of each gene in every individual. We would then find that the gene-effects 
were negatively correlated in the individuals; in other words, there would be a 
negative covariance of gene-effects. (This is essentially the same phenomenon as the 
one explained at the end of Chapter 9 in the section on ‘Competition’.) Of course 
individuals selected by their phenotypic values do not have identical genotypic values, 
but their genotypic values are more alike than those of a randomly chosen set of 
individuals, and so a negative disequilibrium covariance is generated by the selection. 

Now consider the additive variance in the progeny of the selected individuals,
when they are mated at random to produce full-sib families. The disequilibrium due
to unlinked genes is halved in the progeny, as explained in Chapter 1, but it appears
only in the between-family variance for the following reason. The variance within
families due to unlinked genes is not affected by the phenotypic values of the parents
and so is unaffected by the selection. It is therefore simply $\tfrac{1}{2}V_A$. The additive
variance of the means of full-sib families is half the additive variance of the selected
parents. It is therefore $\tfrac{1}{2}(1 - h^2k)V_A$, the disequilibrium covariance being $-\tfrac{1}{2}h^2kV_A$.
Adding the between-family and within-family variances gives the total additive
variance due to unlinked genes of $(1 - \tfrac{1}{2}h^2k)V_A$. The consequences of one generation
of selection outlined above are summarized in Table 11.1.


Table 11.1 Variance components in selected parents and their progeny. $V_P$ and $V_A$ are
the variances and $h^2$ the heritability in the parental generation before selection;
$k = i(i - x)$, equation [11.9]. Families are full sibs.

Component             Parents after selection    Progeny (Generation 1)

Phenotypic            $(1 - k)V_P$               $(1 - \tfrac{1}{2}h^4k)V_P$
Additive              $(1 - h^2k)V_A$
  Between families                               $\tfrac{1}{2}(1 - h^2k)V_A$
  Within families                                $\tfrac{1}{2}V_A$
  Total                                          $(1 - \tfrac{1}{2}h^2k)V_A$
Disequilibrium        $-h^2kV_A$                 $-\tfrac{1}{2}h^2kV_A$


Dominance variance is ignored in the above treatment; but in any case, if present, 
it would be very little affected by disequilibrium because only one-quarter of it appears 
in the between-family component. The environmental variance is, of course, the 
same in the progeny as in the parents before selection. With $V_D$ and $V_E$ unchanged,
or nearly so, the reduced additive variance results in a reduced heritability. If $h_1^2$
and $h^2$ are the heritabilities in progeny and parental generations respectively, the
ratio of the two can be shown from the quantities in Table 11.1 to be

$$\frac{h_1^2}{h^2} = \frac{1 - \tfrac{1}{2}h^2k}{1 - \tfrac{1}{2}h^4k} \qquad \ldots [11.10]$$


The first generation of response to selection depends on the heritability in the parental
generation, and is predicted by equation [11.3]. The second generation of response
is reduced because both the heritability and the phenotypic variance are reduced.
If $R_1$ and $R_2$ are the first two generations of response, the ratio of the two can be
shown from equation [11.10] and Table 11.1 to be

$$\frac{R_2}{R_1} = \frac{1 - \tfrac{1}{2}h^2k}{\sqrt{1 - \tfrac{1}{2}h^4k}} \qquad \ldots [11.11]$$


Some representative values of this ratio are given in Table 11.2. It will be seen 
that with high heritabilities and high intensities of selection the expected response 
is about 20 per cent less in the second generation than in the first. With low 
heritabilities, however, the reduction is much less. 


Table 11.2 The response to selection in the second generation relative to that in the
first, calculated by equation [11.11], for some representative values of the heritability
$h^2$ in the base population and of the proportions selected, p per cent.

p%        Heritability, $h^2$
          0.9      0.7      0.5      0.3      0.1

 5        0.76     0.79     0.83     0.89     0.96
20        0.78     0.81     0.85     0.90     0.96
50        0.83     0.85     0.88     0.92     0.97
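
The entries of Table 11.2 can be recomputed with a short sketch. It reads equation [11.11] as the ratio of (1 - h²k/2) to the square root of (1 - h⁴k/2), takes k = i(i - x) from [11.9], and derives i and x from the standard normal distribution rather than from Appendix Table A; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def variance_reduction_factor(proportion_selected):
    """k = i (i - x) for truncation selection on a normal distribution."""
    standard_normal = NormalDist()
    x = standard_normal.inv_cdf(1.0 - proportion_selected)
    i = standard_normal.pdf(x) / proportion_selected
    return i * (i - x)


def second_generation_ratio(h2, proportion_selected):
    """R2 / R1 as read from equation [11.11]."""
    k = variance_reduction_factor(proportion_selected)
    return (1.0 - 0.5 * h2 * k) / math.sqrt(1.0 - 0.5 * h2 ** 2 * k)


heritabilities = (0.9, 0.7, 0.5, 0.3, 0.1)
for p in (0.05, 0.20, 0.50):
    row = [second_generation_ratio(h2, p) for h2 in heritabilities]
    print(f"{p * 100:3.0f}%  " + "  ".join(f"{ratio:.2f}" for ratio in row))
```

Rounded to two decimal places, the printed ratios should match the tabulated values, which gives some reassurance that the formula has been read correctly.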


Though half the disequilibrium (with unlinked genes) is lost by recombination in 
the progeny, more is regenerated by the next selection. The additive variance in 
any generation $(t + 1)$ can be expressed in terms of that of the previous generation
$(t)$ by the recurrence equation

$$V_{A(t+1)} = \tfrac{1}{2}\left[1 - kh^2_{(t)}\right]V_{A(t)} + \tfrac{1}{2}V_A$$

where $V_A$ is the additive variance in the base population (see Bulmer, 1985, eq.
9.30). The reduction of additive variance, however, does not go on indefinitely. 
A balance is soon reached where the increase of disequilibrium due to selection is 
balanced by the loss due to recombination. In the absence of linkage three or four 
generations are sufficient to bring the population very near to this balance, though 
with linked genes it takes longer. After the state of balance is reached the response 
is expected to remain constant until gene-frequency changes become large enough 
to affect it. The constancy of response usually found in experiments is therefore 
not in conflict with the effect of selection on variance, at least after the first generation.
But the response observed after the first generation is expected to be somewhat
less than would be predicted by equation [11.3] from the heritability in the base
population. The discrepancy will be small when the heritability and selection intensity
are low, and greater when they are higher.
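
The approach to the balance can be illustrated by iterating the recurrence. The sketch below takes the recurrence in the form V_A(t+1) = ½[1 - k h²(t)] V_A(t) + ½ V_A, and further assumes unlinked genes, a constant environmental variance, no dominance variance (so that the phenotypic variance in generation t is V_A(t) + V_E), and the same proportion selected in every generation. The function names and the starting values are hypothetical.

```python
from statistics import NormalDist


def variance_reduction_factor(proportion_selected):
    """k = i (i - x) for truncation selection on a normal distribution."""
    standard_normal = NormalDist()
    x = standard_normal.inv_cdf(1.0 - proportion_selected)
    i = standard_normal.pdf(x) / proportion_selected
    return i * (i - x)


def additive_variance_trajectory(v_a_base, v_e, proportion_selected, generations):
    """Additive variance in generations 0..generations under repeated selection."""
    k = variance_reduction_factor(proportion_selected)
    v_a = v_a_base
    trajectory = [v_a]
    for _ in range(generations):
        h2_t = v_a / (v_a + v_e)  # heritability in the current generation
        v_a = 0.5 * (1.0 - k * h2_t) * v_a + 0.5 * v_a_base
        trajectory.append(v_a)
    return trajectory


# Hypothetical base population: V_A = 0.5, V_E = 0.5 (h^2 = 0.5), 20% selected.
trajectory = additive_variance_trajectory(0.5, 0.5, 0.20, generations=6)
for generation, v_a in enumerate(trajectory):
    print(f"generation {generation}: V_A = {v_a:.4f}")
```

In this example most of the reduction occurs in the first generation and the values settle within three or four generations, in line with the statement in the text.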

Example 11.6 An experiment with sheep described by Atkins and Thompson (1986) 
provides an excellent verification of the effects expected from gametic phase disequilibrium 
generated by selection. Two-way selection was made for the length of the cannon-bone 
(a bone in the foreleg) in a population of Scottish Blackface sheep, the length being adjusted 
for body weight. The selection was applied for about 8 generations in a period of 19 
years. The response was estimated from the divergence between the upward and downward
selected lines. There was an unselected control line from which the parameters of the
base population were estimated. The heritability in the base population was 0.56 ± 0.04.
The realized heritability was 0.52, calculated by equation [11.7] after allowance for complications
which need not concern us. As expected from the effects of gametic phase
disequilibrium, the realized heritability was less than that of the base population. To
test whether the observed response agreed with what would be expected when disequilibrium
is allowed for, the authors calculated the base-population heritability that would
be required to give the observed response when disequilibrium is taken into account. 
They found this to be 0.57, which agrees very well with the observed value of 0.56, 
showing that the reduction of the response due to disequilibrium was almost exactly as 
expected. 

The phenotypic variance was also reduced by the expected amount. It was reduced 
by 9 per cent in the high line and 11 per cent in the low, while the expected reduction 
was 10 per cent. 

Problems 

The effects of selection on variance, described at the end of the chapter, are to be 
ignored in all these problems except, as a refinement, in 11.3. 

11.1 What would be the expected rate of progress per generation if selection were 
applied to the characters in the table, individuals being selected on the basis of their 
own phenotypic merit? 





                                                               Phenotypic    Proportion
                                                Heritability   variance      selected (%)

(1)  Body weight of mice (g)                    0.37           10.7          (a) 25
                                                                             (b) 50
                                                                             (c) 75
(2)  Development time in Tribolium (days)       0.18           1.7           10
(3)  Female fertility of mice (litter size)     0.22           4.3           30


[Solution 9] 

11.2 The species of Darwin’s Finch in Problem 10.8 suffered severe mortality as 
a result of a drought in 1977. This species eats seeds, and the seeds available during 
the drought were mainly large and hard ones. The surviving birds, in comparison 
with the population before the drought, were larger in several dimensions, particularly
of the bill. The mean depth of the bill in 642 birds before the drought was 9.42 mm 
and in 85 birds after the drought was 9.96 mm. What change in bill depth would 
you predict from this selective survival and the data in Problem 10.8? 

Data from Boag, P.T. & Grant, P.R. (1981) Science, 214, 82–4.


[Solution 19] 

11.3 Suppose that selection for weight gain from 5 to 9 weeks in a flock of broiler 
chickens is planned. From the following data predict the mean weight gain after 
five generations of selection. In each generation 4 males and 8 females will be
selected, each out of 60 birds measured. Base population: mean = 738 g, standard
deviation = 111 g, heritability (from sib analysis) = 0.81. 

Data based on Pym, R.A.E. & Nichols, P.J. (1979) Brit. Poult. Sci., 20, 73–86.

[Solution 29] 


11.4 The data in the table refer to selection for increased and for decreased plasma 
cholesterol levels in mice. Calculate the realized heritability from the two lines 
separately and from the divergence. M = generation mean, P = mean of selected 
individuals to be used as parents of the next generation. The units are log (mg/100 ml). 
The sexes are averaged. 


Generation        High                Low
               M        P          M        P

0              2.16     2.32       2.16     2.02
1              2.26     2.34       2.06     2.00
2              2.26     2.37       2.03     1.97
3              2.33     2.41       2.02     1.96
4              2.45     2.47       2.05     2.01
5              2.44     -          2.01     -


Data from Weibust, R.S. (1973) Genetics, 73, 303–12.


[Solution 39] 


11.5 The table gives the data on the selection of mice for small body size in the 
last generation of the experiment shown in Fig. 11.5. Calculate the unweighted and 
the weighted selection differentials. Treat female and male parents separately and 
then combine them. What conclusion can be drawn about natural selection? 


Mating      Weights (g) of parents      Number of offspring
number      Female       Male           measured

 1           7.6         12.4            1
 2          12.4         14.5            9
 3          13.5         11.6            0
 4          13.7         11.6           15
 5          13.2         14.3            5
 6          17.2         17.1           14
 7          10.7         13.8           10
 8          12.9         11.6            9
 9          14.2         10.1            0
10          10.5         13.1            6


Mean of population from which parents were selected: 
Females, 13.14; Males, 14.80 


[Solution 49]

11.6 Suppose that you are selecting sheep for growth rate in an experiment with
the following procedure. Equal numbers of males and females are selected and mated 
in single pairs, the numbers selected being large. Both sexes breed first when they 
are 2 years old, and subsequently produce lambs once each year. The average number 
of lambs reared to maturity is 1.2 per ewe per breeding season. Equal numbers are 
selected each year. For how many years should you keep the parents before replacing
them by selected offspring in order to maximize the rate of progress per year?
What proportion of lambs will then be selected in each year? 

[Solution 59] 

11.7 The breeding scheme for sheep in Problem 11.6 is not a practically realistic 
one. To be more realistic, assume that instead of mating each male to only one female, 
males are mated each to 10 females. This affects the optimal procedure which is 
now to discard females when they are 4 years old, having had 3 breeding seasons, 
and to discard males when they are 3 years old, having had 2 breeding seasons. 
How much better would this be than the optimal procedure with single-pair matings? 
Why is the optimal age for discarding both sexes lower than in Problem 11.6? 

[Solution 111] 

11.8 If sows are to be selected for litter size, selection can be based on the size 
of their first litter or on the mean size of two or more litters. Increasing the number 
of litters has the advantages of increasing the intensity of selection and of increasing 
the heritability, but it has the disadvantage of increasing the generation length. What 
is the optimum number of litters for maximizing the expected response per year? 
Take the repeatability of litter size to be 0.409 from Problem 8.6. Assume that (i) 
sows have their first litters when they are 1 year old and subsequently have litters 
at 6-month intervals, (ii) the average number of individuals reared per litter is 8, 
(iii) the number selected is the number required to replace the parents. For simplicity, 
assume further that (iv) the number selected is large and (v) generations are non-overlapping,
the offspring of selected individuals all having their first litters when
the youngest is 1 year old. 


[Solution 121] 

11.9 Suppose that one of the genes affecting sternopleural bristle score in Drosophila
is additive, that the two homozygotes differ by 0.3 bristles, and that the increasing 
allele is at a frequency of 0.4. What will be the frequency of this allele after the 
10 per cent highest-scoring flies have been selected in two successive generations 
in a population with a phenotypic standard deviation of 2.0 bristles? 


[Solution 131] 



SELECTION: 

II. The results of experiments 


In the last chapter we saw that the theoretical deductions about the effects of artificial 
selection are limited to the change of the population mean, and strictly speaking 
over only one generation. By changing the gene frequencies, selection changes the 
genetic properties of the population upon which the effects of further selection depend. 
And, because the effects of the individual loci are unknown, the changes of gene 
frequency cannot be predicted, and so the response to selection can be predicted 
only for as long as the genetic properties remain substantially unchanged. Thus there 
are many consequences of selection that can be discovered only by experiment. The 
object of this chapter is to describe briefly what seem to be the most general conclusions
about these consequences that have emerged from experimental studies of
selection. The most important questions to be answered by experiment concern the 
long-term effects of selection. For how long does the response continue? By how 
much can the population mean ultimately be changed? What is the genetic nature 
of the limit to further progress? Before dealing with the long-term effects, however, 
there are two questions to be considered concerning the earlier generations during 
which the rate of response remains more or less constant. These are the repeatability
of responses and asymmetry of responses to selection in opposite directions.

Short-term results 

Repeatability of response 

The variability of the responses from one generation to the next was commented 
on in the last chapter, and four causes were given: random drift due to the restricted 
number of parents, sampling error in estimating the generation mean, variation of 
selection differentials, and environmental factors. The question to be considered now 
is the variability of the overall response: if the experiment were repeated, how closely 
would the results agree? An answer to this question is needed before a standard error 
can be attached to the realized heritability as an estimate of the heritability in the 
base population, or to allow comparisons to be made between different experiments. 
We are concerned here only with the period of selection during which the response 
remains constant and a linear regression can be fitted to the generation means. The 
standard error of the slope of the fitted regression line can, of course, be calculated. 
But this does not tell us how much variation in slope there would be between
replicates, because it does not take account of the variation between replicates arising
from random drift. Random drift causes changes of gene frequencies which are
reflected in changes of the generation mean. In consequence, replicate lines selected 
independently come to drift apart in the manner that was illustrated in Fig. 3.2. The 
changes due to drift are cumulative, any change in one generation being carried on 
as the starting point for the change in the next generation. Because of the cumulative 
nature of the changes due to drift, the deviations from regression of any one line 
do not include all the variation due to drift. For this reason the standard error of 
the fitted regression underestimates the variation between replicates. This point is 
illustrated in Fig. 12.1. In the experiment illustrated there were six replicates. Fig. 
12.1(a) shows the response of all the replicates together treated as if they were a 
single large population. Fig. 12.1 (b) shows the responses in three of the six replicates. 
Regression lines were fitted to each of the six replicates and their slopes are given 
in Table 12.1. The realized heritabilities can be estimated either from the single 
regression lines in Fig. 12.1(a) or from the means of the six replicates in Table 12.1; 
there is little difference between them. The standard errors, however, differ greatly. 



[Fig. 12.1: plots not reproduced. Horizontal axis: cumulated selection differential (g).]

Fig. 12.1. Ten generations of two-way selection for six-week weight in mice (Falconer, 1973). 
Generation means, measured as deviations from controls, are plotted against cumulated 
selection differentials (original data). 

(a) The whole ‘population’ consisting of six replicates selected in each generation. The fitted 
regression lines have slopes ±s.e. of: Upward, b = 0.398 ± 0.020; Downward, 
b = 0.328 ± 0.014. 

(b) Three of the six replicates selected in each direction. Each replicate was bred from 8 pairs
of parents with minimal inbreeding, giving $N_e = 31$ (by equation [4.9]). The realized
heritabilities of these and the other replicates are given in Table 12.1.

Table 12.1 Regression coefficients estimating realized heritability in replicate selection
lines. The replicates are listed in order of the magnitude of their responses; those
shown in Fig. 12.1(b) are marked *.

              Upward selection     Downward selection

              0.457*               0.501*
              0.448                0.376
              0.438                0.365
              0.390                0.301
              0.385*               0.288*
              0.251*               0.159*

Mean          0.395 ± 0.031        0.331 ± 0.046
Difference              0.064 ± 0.055


The standard errors of the single regression lines are 0.020 upwards and 0.014 
downwards; the standard errors calculated from the actual variation between replicates 
are 0.031 upwards and 0.046 downwards. 
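
The empirical standard errors can be recomputed from the replicate slopes listed in Table 12.1. This is a minimal sketch: it assumes that the standard error of a mean is the standard deviation between the six replicates divided by the square root of 6, and that the standard error of the difference combines the two in quadrature. The function name is illustrative only.

```python
import math
import statistics

# Slopes of the six replicate lines in each direction (Table 12.1).
upward = [0.457, 0.448, 0.438, 0.390, 0.385, 0.251]
downward = [0.501, 0.376, 0.365, 0.301, 0.288, 0.159]

def mean_and_se(slopes):
    """Mean of the replicate slopes and the empirical standard error of that mean."""
    mean = statistics.mean(slopes)
    se = statistics.stdev(slopes) / math.sqrt(len(slopes))
    return mean, se

mean_up, se_up = mean_and_se(upward)
mean_down, se_down = mean_and_se(downward)
difference = mean_up - mean_down
se_difference = math.sqrt(se_up ** 2 + se_down ** 2)

print(f"upward:     {mean_up:.3f} +/- {se_up:.3f}")
print(f"downward:   {mean_down:.3f} +/- {se_down:.3f}")
print(f"difference: {difference:.3f} +/- {se_difference:.3f}")
```

The values obtained agree with the means and standard errors of Table 12.1 to within rounding.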

Sampling variance 

The sampling variance of a realized heritability is not a straightforward matter. The 
following simplified account will allow an approximate standard error to be obtained. 
For details see Hill (1971, 1972c, d, 1980). The important conclusion is that the 
standard error of the regression seriously underestimates the standard error of the 
realized heritability; the latter can be between three and five times the former under 
a wide range of circumstances. Consider first the response in a single selected line, 
measured as the mean of the last generation, the base population mean being assumed 
to be known without error. Environmental differences between generations will be 
assumed to be negligible. There are then two sources of error in the estimation of 
the generation mean: variance due to random drift, $\sigma_d^2$, and variance due to
measurement error, $\sigma_e^2$. The first depends on the effective number of parents, $N_e$,
and the second on the number of individuals measured, $M$, from which the mean
is estimated. The drift variance of a line bred without selection is approximately
equal to $2FV_A$ for reasons to be explained in Chapter 15 (see Table 15.1). The
inbreeding coefficient $F$ is approximately equal to $t\Delta F$, where $t$ is the number of
generations and $\Delta F$ is the rate of inbreeding, which is equal to $1/(2N_e)$ (equation
[4.1]). So $\sigma_d^2 = tV_A/N_e$, approximately. The drift variance of a selected line is not
so simple but the same expression is likely to be a good approximation (Hill, 1980).
The sampling variance from measurement error is simply $V_P/M$, where $M$ is the
number of individuals measured. The sampling variance of the response is therefore

\[ \sigma_R^2 = \sigma_d^2 + \sigma_e^2 = \left( \frac{t h^2}{N_e} + \frac{1}{M} \right) \sigma_P^2 \quad \text{(approx.)} \]

If the response is estimated as the difference between two means, as with two-way
selection or a selected and a control line, the sampling variance of the response is
the sum of the variances of the two lines. If the numbers measured and bred from
are the same in the two lines the sampling variance of the response is twice the above
expression. If the realized heritability is obtained by dividing the response by the
cumulated selection differential $\Sigma S$, then its standard error is $\sigma_R/\Sigma S$.
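
The approximate standard error can be worked through numerically. The sketch below simply evaluates the expressions given above; the function names and all the input values are hypothetical and serve only to show the arithmetic.

```python
import math

def response_sampling_variance(t, h2, n_e, m, var_p, two_lines=False):
    """Approximate sampling variance of the response after t generations.

    Drift contributes t*h2*var_p/n_e and measurement error var_p/m.
    With two lines of equal size (two-way selection, or a selected line
    and a control) the variance is doubled.
    """
    variance_one_line = (t * h2 / n_e + 1.0 / m) * var_p
    return 2.0 * variance_one_line if two_lines else variance_one_line

def realized_h2_se(t, h2, n_e, m, var_p, cum_sel_diff, two_lines=False):
    """Standard error of the realized heritability: sigma_R divided by the
    cumulated selection differential."""
    var_r = response_sampling_variance(t, h2, n_e, m, var_p, two_lines)
    return math.sqrt(var_r) / cum_sel_diff

# Hypothetical experiment: 10 generations, N_e = 31, 100 individuals measured,
# phenotypic variance 4.0 and a cumulated selection differential of 20.
var_single = response_sampling_variance(t=10, h2=0.35, n_e=31, m=100, var_p=4.0)
var_double = response_sampling_variance(t=10, h2=0.35, n_e=31, m=100, var_p=4.0,
                                        two_lines=True)
se_h2 = realized_h2_se(t=10, h2=0.35, n_e=31, m=100, var_p=4.0,
                       cum_sel_diff=20.0, two_lines=True)
print(f"sigma_R^2 (one line)  = {var_single:.3f}")
print(f"sigma_R^2 (two lines) = {var_double:.3f}")
print(f"s.e. of realized h2   = {se_h2:.3f}")
```

With these hypothetical values the drift term is about eleven times the measurement term, which illustrates why measurement error is neglected later in the section when the number of generations is not very small.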

Selection experiments often yield no response over the first one or two generations,
or even longer, but give a clear response later. The reason is usually that the
numbers have been too small and random drift in the ‘wrong’ direction has nullified 
the response. So what scale of experiment is needed to be reasonably confident of 
obtaining a demonstrable response? What size of population is needed and for how 
many generations should the selection be continued? The solution is explained by 
Nicholas (1980); here we can only illustrate the conclusions by one example. We 
shall suppose that there is one selection line and one unselected control, with the 
same numbers measured and used as parents in each. We first need an objective: 
how small do we want the standard error to be? The objective we shall adopt is 
to achieve a response that is a certain multiple of its standard error, say five times. 
The expected response is 

\[ R = t i h^2 \sigma_P \]

(equation [11.3] multiplied by t generations). The intensity of selection, i, is known 
in advance from the proportion to be selected, but the heritability has to be assumed 
or guessed at. If either the numbers of individuals or the number of generations is 
fixed in advance, it is possible to work out what is needed from the expression for 
$\sigma_R^2$ above, but this is complicated. A simple but inexact solution is as follows. Note
that the drift variance increases with the number of generations but the measurement
error does not. Therefore if the number of generations planned is not very
small, say 10 or more, most of the sampling variance comes from drift and the 
measurement error can be neglected. We then obtain the simplified expression for 
the ratio of the expected response to its standard error 

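\[ \frac{R}{\sigma_R} = i h \sqrt{\tfrac{1}{2} t N_e} \quad \text{(approx.)} \]

If we make the objective $R/\sigma_R = c$, then we obtain the solution

\[ t N_e = 2 \left( \frac{c}{i h} \right)^2 \quad \text{(approx.)} \]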

To see what this means let us suppose that 25 per cent are to be selected, making 
i = 1.271; that the heritability is expected to be 0.3; and that the objective is c = 
5. This means $tN_e = 103$. Note that if $N_e = N$ (the number used as parents), 103
is the total number of parents in each line in the whole experiment. So if we planned 
to select for 10 generations we would need at least 10 parents per generation in each 
line to achieve the objective. Actually we would need more because measurement 
error has been neglected. For more elaborate, and accurate, assessments of the 
requirements see Nicholas (1980). It makes no difference to the standard error if 
a fixed number of parents is bred as a single population or is divided into replicate 
lines; an advantage of having replicate lines, however, is that an empirical standard 
error can then be obtained (Hill, 1971). 
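
The arithmetic of this example can be reproduced as follows. This is a minimal sketch of the simplified solution only: measurement error is neglected, one selected line and one control of equal size are assumed, and the function names are illustrative.

```python
def required_t_ne(c, i, h2):
    """Product t*N_e needed for the expected response to be c times its
    standard error, from t*N_e = 2*(c/(i*h))**2 with h**2 = h2."""
    return 2.0 * c ** 2 / (i ** 2 * h2)

def parents_per_generation(c, i, h2, generations):
    """N_e per line per generation when the number of generations is fixed."""
    return required_t_ne(c, i, h2) / generations

# Values used in the text: 25 per cent selected (i = 1.271), h2 = 0.3, c = 5.
t_ne = required_t_ne(c=5, i=1.271, h2=0.3)
n_e_for_10_generations = parents_per_generation(c=5, i=1.271, h2=0.3, generations=10)
print(f"t * N_e             = {t_ne:.1f}")
print(f"N_e over 10 gens    = {n_e_for_10_generations:.1f}")
```

The same function shows how the requirement scales: halving the heritability or the square of the selection intensity doubles the total number of parents needed.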

Asymmetry of response

The experiment illustrated in Figs. 11.4 and 11.5 showed different rates of response 
to selection in opposite directions. Selection for increased body weight was only 
one-third as effective as selection for decreased body weight, when compared by 
the realized heritabilities. Asymmetrical responses have been found in many two- 
way selection experiments, and indeed most experiments show asymmetry in some 
degree. Asymmetry of response has important practical consequences for the following
reason. The prediction of a response is made from the heritability estimated in
the base population. This can be presumed to predict the mean of the responses in 
the two directions, and if there is asymmetry the response in one direction will fall 
short of expectation. Thus if a character of economic importance is selected in one 
direction only, the response may disappoint the breeder by being less than was 
expected. It would be useful to be able to predict when asymmetry is likely to occur, 
and particularly its direction, but this can be done only to a very limited extent. 
The reason is that there are several possible causes of asymmetrical responses, and 
only a few of these can be revealed by observations made before selection has been 
applied. The main causes that may generate asymmetrical responses are as follows. 


1. Random drift. If there is only one selection line in each direction, asymmetry 
of response can easily result from random drift, as explained in the preceding section.
In any particular case, therefore, the first question must be whether the asymmetry
is real in the sense that the realized heritabilities in the two directions are
significantly different. Without replication of the selection lines it is not easy to prove 
the reality of the asymmetry. In Fig. 11.5 there was no replication and the reality 
of the asymmetry is therefore doubtful. In Fig. 12.1(a) there was replication and 
the asymmetry was proved to be no more than was expected by chance. Asymmetry 
due to random drift cannot be predicted. 

2. Selection differential. The selection differential may differ between the upward
and downward selected lines, for several reasons. (i) Natural selection may aid
artificial selection in one direction or hinder it in the other. (ii) The fertility may
change so that a higher intensity of selection is achieved in one direction than in
the other. (iii) The variance may change as a result of the change of mean: the selection
differential will increase as the variance increases and decrease as it decreases.
This is a ‘scale-effect’, to be discussed more fully in Chapter 17. Differences of
the selection differential influence the response per generation, and the agreement
between observed and predicted responses, but they affect the realized heritability
only a little (Falconer, 1954). Therefore asymmetry of realized heritabilities cannot
be attributed to any cause operating through the selection differential.

3. Inbreeding depression. Most experiments on selection are made with populations
not very large in size, and there is usually therefore an appreciable amount
of inbreeding during the progress of the selection. If the character selected is one
subject to inbreeding depression, there will be a tendency for the mean to decline
through inbreeding. This will reduce the rate of response in the upward direction
and increase it in the downward direction, thus giving rise to asymmetry. An
unselected control population, subject to the same inbreeding depression, will reveal
how much asymmetry can be attributed to this cause. Prior knowledge of the rate
of inbreeding depression would allow the asymmetry of response to be predicted.

4. Maternal effects. Characters complicated by a maternal effect may show an 
asymmetry of response associated with the maternal component of the character. 
The asymmetry in the experiment of Fig. 11.5 was of this sort. The character selected,
6-week weight of mice, may be divided into two components, weaning weight
and post-weaning growth, the former being maternally determined. The weaning
weights increased hardly at all in the large line but decreased very much in the small
lines. Thus the asymmetry of response was all in the maternal component of the
character. To attribute asymmetry of response to a maternal effect, however, only 
transfers the problem from the character selected to another and does not explain 
the asymmetry. 

5. Genetic asymmetry. The additive genetic variance and the heritability depend 
on the gene frequencies. Additive genes contribute maximally to the heritability when
the gene frequency is 0.5, and recessive genes when the recessive allele has a frequency
of 0.75 (see Fig. 8.1). These will be called the ‘symmetrical’ gene frequencies.
If all the genes affecting the character were at these symmetrical frequencies
in the initial population, the realized heritabilities would gradually diminish as the 
gene frequencies became changed, but the diminution would be roughly equal in 
lines selected in opposite directions and there would be no asymmetry. Suppose, 
however, that the population starts with gene frequencies above or below these values. 
In one line the frequencies will then move away from the symmetrical values and 
the heritability will diminish. But in the line selected in the opposite direction the 
gene frequencies will move toward the symmetrical values and the heritability will 
increase. Thus asymmetry will develop as the gene frequencies become different 
in the up- and down-lines. The response observed depends on the combined effects 
of all the loci, and asymmetry is to be expected if the ‘average’ gene frequencies 
are different from the symmetrical values of 0.5 for additive genes and 0.75 for 
recessive genes, the ‘averages’ being weighted for the gene effects. Asymmetry of 
response from this cause, however, would not be expected to appear immediately
in the first few generations because it depends on differentiation in gene frequencies.
Furthermore, it would be associated with non-linear responses, because it
depends on the response decelerating in one line and accelerating in the other. So 
it could not readily explain asymmetry of responses that are not detectably non-linear. 
Genetic asymmetry can be looked at in a different way as the relation of the starting 
point to the selection limits. The theoretical limits to selection are when all favourable 
alleles have been fixed. Asymmetry of response will result if the initial population 
is not mid-way between the two limits in phenotypic value, so that the selection 
response has further to go in one direction than in the other. If selection favours 
heterozygotes, the situation is a little different because the limit in one direction is 
not fixation but is the equilibrium gene frequency. Asymmetry will result if the initial 
population is not at the point, in respect of gene frequencies, where the additive 
variance is maximal. 

6. Genes with large effects. Asymmetry of response that appears immediately in
the first generation can result from genetic asymmetry of genes with large effects.
The reason why the asymmetry is immediate is that the first selection of parents
produces a large change of gene frequency, equivalent to many generations of selection
on genes with small effects. The asymmetry results from the initial gene frequencies
not being at the symmetrical points. If the first response is asymmetrical
it follows that the regression of offspring on mid-parent values in the base population
will be non-linear (Robertson, 1977b). Asymmetrical responses of this sort should
therefore be predictable.

7. Scalar asymmetry. The genetic and environmental variation may be skewed
to different degrees or in opposite directions. The genetic variation will then make
up a larger proportion of the total at one end of the distribution than at the other.
In consequence the offspring-parent regression in the base population will be non-linear
and the response asymmetrical in the first generation. The situation envisaged
is shown diagrammatically in Fig. 12.2(a), where the genetic and environmental
variances are skewed in opposite directions. The difference in skewness may be a
scale effect, as will be explained in Chapter 17, or it may be due to genotype-
environment interaction in the following way. Individuals that experience a good
environment may exhibit less genetic variation than those that experience a poor
environment; this is illustrated in Fig. 12.2(b). Or, individuals with high genetic
values may be more susceptible to environmental variation than individuals with
low genetic values, as in Fig. 12.2(c). In either case, individuals with high values
will exhibit a lower heritability than those with low values. The difference in skewness
could equally well be the other way round from that shown in Fig. 12.2, in which
case the upward heritability would be greater than the downward. This form of asymmetrical
response should, again, be predictable from a non-linear offspring-parent
regression in the base population. For details see Robertson (1977b).

8. Indirect selection. Sometimes the criterion of selection is not quite the same 
as the character measured for assessing the response. Then, if the measured character 
is not linearly related to the selection criterion, asymmetry of response may result. 
Baptist and Robertson (1976) selected Drosophila for body size by inducing them 
to crawl as far as they could through a series of slits of decreasing diameter. The 
criterion of selection was the number of slits traversed, and this responded symmetrically.
The procedure, however, selected not only for body size but also for
activity. Small flies are more active than large and this led to a non-linear relation 
between body size and slit score. In consequence, body size responded less to upward 
selection than to downward. 
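
The ‘symmetrical’ gene frequencies mentioned under genetic asymmetry (cause 5 above) can be located numerically. The sketch below assumes the standard single-locus expression for the additive variance, $2pq[a + d(q - p)]^2$, which is not derived in this section, with $q$ the frequency of the allele that reduces the character (the recessive allele when $d = a$). The function names and the grid search are illustrative only.

```python
def additive_variance(q, a=1.0, d=0.0):
    """Additive variance contributed by one locus, 2pq[a + d(q - p)]^2.

    q is the frequency of the allele that lowers the character; with
    d = a this allele is completely recessive.
    """
    p = 1.0 - q
    alpha = a + d * (q - p)  # average effect of the gene substitution
    return 2.0 * p * q * alpha ** 2

def frequency_of_maximum(d, steps=1000):
    """Frequency q, on a grid of 1/steps, at which the additive variance is greatest."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda q: additive_variance(q, d=d))

q_additive = frequency_of_maximum(d=0.0)   # no dominance
q_recessive = frequency_of_maximum(d=1.0)  # complete dominance (d = a)
print(f"additive gene:  variance greatest at q = {q_additive}")
print(f"recessive gene: variance greatest at q = {q_recessive}")

# Starting away from the symmetrical point, selection in one direction moves
# the frequency towards it (variance rises) and in the other away (variance falls).
start, towards, away = 0.3, 0.4, 0.2
print(additive_variance(away), additive_variance(start), additive_variance(towards))
```

The last line illustrates the origin of the asymmetry for an additive gene starting at a frequency of 0.3: a change of frequency towards 0.5 increases the additive variance and an equal change away from 0.5 reduces it.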

With all these possible causes it is not surprising that asymmetrical responses are 
often found. Nor is it surprising that the cause operating in a particular case is hard 
to identify. Some of the causes make prediction of asymmetry possible but others 
do not and, until asymmetrical responses are better understood, the prediction of 
rates of response from the heritability in the base population will remain somewhat 
unreliable. There is, however, one generalization that might be made, or at least 
suggested. It is that if the character selected is a component of natural fitness, asymmetry
should be expected, with selection towards increased fitness giving a slower
response than selection towards decreased fitness. The reasons are, first, that these
characters usually show inbreeding depression, which is itself a cause of asymmetry,
and, second, that if the character has been subject to natural selection the gene frequencies
are likely to be above the symmetrical point, i.e., nearer the upper limit,
thus giving rise to genetic asymmetry.

Fig. 12.2. Frequency distributions illustrating scalar asymmetry. (a) shows the additive genetic
variation (solid line) with negative skewness, and the environmental variation (broken line) with
positive skewness. (b) shows the additive genetic variation among individuals whose environmental
values are at the points marked by the arrows. (c) shows the environmental
variation among individuals whose breeding values are at the points marked by the arrows. In
both situations upward selection will give a lower realized heritability than downward selection.
The figure is drawn roughly to scale, and if the distributions in (b) and (c) represented selected
individuals the upward heritability would be about 0.1 and the downward about 0.4.

Long-term results 

The outcome of selection over a long period is unpredictable, at least with our present
understanding. There are two reasons for this: first, the outcome depends on
the properties of the individual genes contributing to the response, which cannot
be determined by observation at the outset; and, second, mutation produces
new variation whose nature we cannot predict. There is, however, a body of theory
that allows us to predict what would happen if certain conditions were present, and
consequently to say what were the probable reasons for what actually happened.
This theory will now be explained in outline. We shall deal first with responses in
the absence of mutation, and then we shall consider the effects of mutation. Some
features commonly found in experiments are shown in Fig. 12.3. These will be commented
on when they bear on the theory.

Selection limits

Without the creation of new variation by mutation, the response to selection cannot 
be expected to continue indefinitely. Sooner or later the genes segregating in the 
base population will be brought to fixation (or equilibrium if there is overdominance) 
by the selection or the accompanying inbreeding. The response will therefore slowly 
diminish and finally cease. The population is then at a ‘plateau’ or selection limit. 
Examples of this type of long-term result are shown in Fig. 12.3(b) and (c). The 
questions to be asked about selection limits are: how large is the total response in 
relation to the initial variation, and how long does it take to get to the limit? First, 
some empirical answers from experiments. We consider only two-way selection 
because of the complications of asymmetry. The total response, $R_T$, is the difference
between the divergent lines when both are at their limits; it will be referred to as 
the range. The number of generations taken to get to a limit is not easy to decide 
because the response gradually decreases as the limit is approached. The time-scale 
can, however, be more precisely expressed as the half-life of the response. This 
is the number of generations taken to go half-way to the limit. The results of four 
experiments are summarized in Table 12.2; two refer to Drosophila and two to mice. 
The ratios of the two means at the limits varied widely. The litter size of the high 
line of mice was only 1.6 times that of the low line (column 4), but the high line 
of Drosophila had 8 times as many bristles as the low (column 1). The range varied 
from 3.6 to 28 additive genetic standard deviations, or from 1.7 to 20 phenotypic 
standard deviations. All the experiments took roughly 20 to 30 generations to reach
the limits, and the half-lives varied from 7 to 12 generations. 

Table 12.2 Limits to selection in four experiments. Explanation in text.

                                                        Experiment
                                                 (1)     (2)     (3)     (4)

Observed ratio of means     H/L                  8.0     1.3     2.5     1.6
Observed range              $R_T/\sigma_A$       28      20      16      3.6
                            $R_T/\sigma_P$       20      12      8       1.7
Theoretical maximum         $R_{\max}/\sigma_P$  82      52      22      8
Effective population size   $N_e$                28      28      15      32
Duration (generations):     Total (approx.)      30      20      25      20
                            Half-life            10      7       9       12
                            Half-life/$N_e$      0.4     0.3     0.6     0.4
Observed/Maximum:           Range                0.24    0.22    0.36    0.21
                            Half-life            0.29    0.19    0.43    0.36
$n = R_T^2/8\sigma_A^2$                          98      50      32      2
Standardized effect:        $2a/\sigma_P$        0.21    0.23    0.24    0.95

Experiments and sources

(1) Drosophila, abdominal bristles: Clayton and Robertson (1957).

(2) Drosophila, thorax length (Fig. 12.3c): F.W. Robertson (1955).

(3) Mouse, 6-week weight (Fig. 11.5): Falconer (1955), Roberts (1966a).

(4) Mouse, litter size: Falconer (1965b, 1971, 1977).

Population size in the Drosophila experiments is taken to be $N_e = 0.7N$ (see Ch. 4).

Fig. 12.3 Four experiments illustrating long-term responses. [Plots not reproduced; the
horizontal axes show generations.]

(a) Two-way selection for oil-content of maize seeds. Broken lines are reversed selection.
(After Dudley, 1977.)

(b) Six replicate lines of Drosophila melanogaster selected upwards for abdominal bristle
number. Selection was suspended at the points marked. (After Yoo, 1980a.)

(c) Drosophila melanogaster, thorax length. (After F. W. Robertson, 1955.)

(d) Mouse, six-week body weight. (Adapted from Roberts, 1966b.)

Dashed lines are responses to selection in the reverse direction; dotted lines are responses to
natural selection, with artificial selection suspended.

(All figures redrawn from the above sources with permission of the authors and publishers.)

In these and most other experiments, selection has taken the mean far beyond the
range of variation in the base population. Though the total responses in Table 12.2
may be impressive when reckoned in terms of the variation present in the original
population, they are not at all spectacular when compared with the achievements
of the breeders of domestic animals. For example, after selection to the two limits,
the large mice were 2.5 times the weight of the small mice. In contrast, the weights
of the largest breeds of dogs are about 100 times the weights of the smallest breeds
(Stockard, 1941). The reason for the disappointing results of experimental selection
when viewed against the differences between the breeds of domestic animals is that
experiments are carried out with closed populations, usually of not very large size.
The breeder of domestic animals in contrast, by intermittent crossing, casts his net
far wider in the search for genes favourable to his purpose and can utilize new mutations
that have occurred elsewhere, some of which may have large effects.

What, then, can theory tell us about long-term selection?

Theory of limits. The total response relative to the initial genetic variation, $R_T/\sigma_A$,
depends primarily on the number of loci contributing to the variation. If, for example,
there were only one additive locus with gene frequencies of 0.5, the most extreme
genotypes would each appear in the base population with frequencies of 1/4; or if
there were two such loci, with frequencies of 1/16. The selection limits, which are
represented by the most extreme genotypes, would then be well within the range
of variation found in the base population. With larger numbers of loci, the extreme
genotypes are rarer in the base population and the selection limits are further removed,
in $\sigma_A$ units, from the original mean. It is very clear that in at least three of the
experiments in Table 12.2 there are many more than ‘a few’ loci contributing to 
the variation in the base populations. The numbers of genes and their effects are 
inversely related because, with a given amount of genetic variation, if there are few 
genes they must have large effects and if there are many genes they must have small 
effects. Since neither the number of genes nor their effects are known in advance 
it is not possible to predict the limits. It is possible, however, to predict a ‘theoretical 
maximum’ for the limit, so let us pursue the theoretical consideration of the limits 
a little further, to find out first how the limit depends on the number of loci and 
then to see what the theoretical maximum can tell us. 

First suppose that the population being selected is bred from a very large number 
of parents, so that no random fixation occurs; and suppose also that there are no 
overdominant loci. Then the favourable alleles at all relevant loci will be made 
homozygous at the limits. In the notation of Chapter 7, the range is $\Sigma 2a$ units of
measurement, i.e., the homozygote difference summed over all loci. How then does
the range relate to the original additive genetic variance? In order to express this
relationship in a simple way we have to make two assumptions that are certainly
not true, but we can see later how the error may affect the conclusions. The first
assumption is that all the loci have the same magnitude of effect on the character
selected. The range is then $R_T = 2na$, where $n$ is the number of loci, each having
a homozygote difference of $2a$. The second assumption is that all the genes start
at frequencies of 0.5. With these assumptions the original additive variance, by equation
[8.7], is $\sigma_A^2 = \Sigma \tfrac{1}{2}a^2 = \tfrac{1}{2}na^2$. (The symbol $\sigma_A^2$ is used here rather than $V_A$
because it simplifies the formulation when the standard deviation $\sigma_A$ is involved.)
Note that when gene frequencies are 0.5 the dominance deviation $d$ does not appear
in the formulation of $\sigma_A^2$, so no assumption about the degree of dominance needs
to be made. The relationship between the range and the additive variance is obtained
by squaring the range, which gives

\[ R_T^2 = 4n^2a^2 = 8n\sigma_A^2 \qquad [12.1] \]

This equation will be considered later as a possible way of estimating the number 
of loci. The responses to be expected with other assumptions about the distributions 
of gene effects and gene frequencies are examined by Hill and Rasbash (1986). 
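
The relation between the range and the number of loci, $n = R_T^2/8\sigma_A^2$, can be applied to the observed ranges in Table 12.2. This is a minimal sketch that rests on the two assumptions stated above (equal effects and initial gene frequencies of 0.5), so the results are rough indications rather than estimates to be relied on; the function name is illustrative.

```python
# Observed range in additive genetic standard deviations, R_T/sigma_A,
# for the four experiments of Table 12.2.
observed_range_in_sigma_a = {1: 28.0, 2: 20.0, 3: 16.0, 4: 3.6}

def number_of_loci(range_in_sigma_a):
    """n = R_T^2 / (8 * sigma_A^2), with the range given in units of sigma_A."""
    return range_in_sigma_a ** 2 / 8.0

estimates = {experiment: number_of_loci(r)
             for experiment, r in observed_range_in_sigma_a.items()}
for experiment, n in estimates.items():
    print(f"experiment ({experiment}): n = {n:.1f}")
```

The values obtained correspond to the row $n = R_T^2/8\sigma_A^2$ of Table 12.2 after rounding to whole numbers.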

The total response considered above depends on the effective population size being 
very large. In practice the number of parents used is seldom large enough for random
drift to be ignored. Some inbreeding therefore occurs, which leads to random
fixation at some loci. In other words, unfavourable alleles are fixed at some loci
despite the selection against them, with the result that the total response is less than
that indicated by equation [12.1]. The limit achieved then depends on the chance
of fixation of the favourable allele at each locus, this chance being determined partly
by the selection and partly by the inbreeding. The way in which the inbreeding and 
selection interact has been worked out by Robertson (1960), Hill and Robertson 
(1966), and Robertson (1970). The main conclusions are as follows. 

The number of loci affects the issue through the coefficient of selection s acting 
on each locus. The larger the number of loci, the smaller are their effects and the 
smaller the coefficient of selection. The chance of fixation of a favourable allele 
depends on its initial gene frequency; the rarer it is the more likely it is to be lost. 
Given the initial gene frequency, the chance of fixation is a function of $N_e s$, which
is the product of the effective population size and the coefficient of selection in favour
of the allele. The coefficient of selection is equal to $i(2a/\sigma_P)$, where $i$ is the intensity
of selection (equation [11.8]). Therefore with a given initial frequency and a given
gene-effect ($2a/\sigma_P$) the chance of fixation of a favourable allele is a function of $N_e i$,
the product of the effective population size and the intensity of selection. Thus the
total response should be greater with larger population sizes and with more intense
selection. This expectation has been confirmed by an experiment with Drosophila
(Jones, Frankham, and Barker, 1968). Selection was carried out for 50 generations
for increased abdominal bristle number in a number of lines with different intensities
of selection and different population sizes. The lines with the greatest total
responses were those with the largest population size and the greatest intensity of
selection. In practice, increasing the intensity of selection usually necessitates a reduction
in the number of parents, which is the population size. The best compromise
is to select 50 per cent. This maximizes the total response, though the rate of progress
will be less than it could be with more intense selection (Robertson, 1960;
Jodar and Lopez-Fanjul, 1977). 
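
The quantity $N_e s$ that governs the chance of fixation can be illustrated with the standardized effects and effective population sizes of Table 12.2. In this sketch the intensity of selection is a hypothetical value, since the intensities actually applied in the four experiments are not given here, and the function name is illustrative.

```python
# Standardized gene effects (2a/sigma_P) and effective sizes from Table 12.2.
standardized_effect = {1: 0.21, 2: 0.23, 3: 0.24, 4: 0.95}
effective_size = {1: 28, 2: 28, 3: 15, 4: 32}

ASSUMED_INTENSITY = 1.0  # hypothetical value of i, for illustration only

def selection_coefficient(i, effect):
    """s = i * (2a/sigma_P): the coefficient of selection on one locus."""
    return i * effect

ne_s = {
    experiment: effective_size[experiment]
    * selection_coefficient(ASSUMED_INTENSITY, standardized_effect[experiment])
    for experiment in standardized_effect
}
for experiment, value in ne_s.items():
    print(f"experiment ({experiment}): N_e * s = {value:.2f}")
```

Because $s$ is proportional to $i$, doubling the assumed intensity doubles every value of $N_e s$, as does doubling the effective population size.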

When the population size is not large, and there is consequently some inbreeding, 
it is still true that the larger the number of loci the larger the response will be in 
relation to the original variance. If the number of loci is very large their effects must 
be very small and most loci become fixed by random drift before selection can fix 
the favourable alleles. In spite of this, the greatest response would be attained if 
the genetic variance were caused by a very large number of loci (strictly speaking 
an infinite number), even though only a small proportion of them are fixed by the 
selection. This is the theoretical maximum response; it is the total response that would 
be attained if the genetic variance were generated by an infinite number of loci. 
With additive genes, the theoretical maximum response is shown to be (Robertson 
1960) 


^(max) 2N e ih o P . . . [12.2] 

All the terms in this expression can be estimated, so the maximum response can 
be predicted. Note that ih 2 o P is the predicted response in one generation (equation 
[11.3]) or the observed response over the first few generations. The maximum 
response to divergent selection is obtained simply by putting i as the sum of the 
intensities of selection in the two directions. With recessive genes, however, the 
maximum response may be much greater, particularly if favourable recessives have 
low initial frequencies. The theoretical maximum response has a half-life of 1 AN e 
generations if all the genes are additive, or up to about 2N e generations if the genes 
are recessive. 
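
As a rough illustration of equation [12.2], the sketch below computes the theoretical maximum response and the additive half-life. The numerical values are hypothetical, chosen only to make the arithmetic easy to follow, and the function names are not from the text.

```python
def max_response(n_e: float, i: float, h2: float, sigma_p: float) -> float:
    """Theoretical maximum response, equation [12.2]: 2 * N_e * i * h^2 * sigma_P."""
    return 2.0 * n_e * i * h2 * sigma_p


def half_life_additive(n_e: float) -> float:
    """Half-life of the response, in generations, when all genes are additive."""
    return 1.4 * n_e


# Hypothetical values, for illustration only.
n_e = 20.0      # effective population size
i = 1.0         # intensity of selection in one direction
h2 = 0.25       # heritability in the base population
sigma_p = 2.0   # phenotypic standard deviation

first_generation = i * h2 * sigma_p            # equation [11.3]
limit = max_response(n_e, i, h2, sigma_p)      # one-way selection
divergent = max_response(n_e, 2 * i, h2, sigma_p)  # i summed over both directions

print(first_generation, limit, divergent, half_life_additive(n_e))
```

The maximum is simply $2 N_e$ times the response in the first generation, so with these values the limit is 40 times the initial response per generation.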

The theoretical maximum response is not something that can be achieved by optimal
selection procedures. The response that can be achieved depends on the number of 
genes. The theoretical maximum is what would be achieved in the most favourable 
genetic situation, which is a very large number of genes — in fact an impossibly 
large number. As a prediction, therefore, the theoretical maximum does no more 
than set an upper limit to what can be expected. There is, however, some interest 
in comparing observed responses with their theoretical maxima because the comparison
gives some idea of how far the assumptions underlying the theoretical maximum
are valid. If the observed response falls far short of the maximum it cannot
have been produced by a very large number of loci of equal effects and at equal 
gene frequencies. Ratios of observed to maximum responses are given in Table 12.2 
in respect of the range and of the half-life. The ratios are between about 0.2 and 
0.4, which shows that none of these responses were due to a very large number 
of genes having equal effects. 

The extent to which fixation has been produced by inbreeding rather than selection
has a bearing on the differences between replicate selection lines at the limits.
If there has been much fixation by inbreeding, different replicates will have different
alleles fixed at many loci. The replicates will then reach different limits and,
furthermore, if they are crossed, some genetic variance will be restored on which
further selection could act. If there has not been much fixation by inbreeding, selection
will have fixed the same alleles at most loci. All replicates will then have the
same limit and no genetic variation will be generated by crossing them. 

Phenotypic variance. The loss of genetic variance expected by the theory outlined 
above should lead to a reduced phenotypic variance. The phenotypic variance, 
however, is seldom found to decline as expected; often it increases. An example 
from Drosophila is provided by the experiment on abdominal bristle number in Table 
12.2. The phenotypic variance in the base population and in the most extreme of 
the replicate high and low lines is illustrated by frequency distributions in Fig. 12.4. 



Fig. 12.4. Frequency distributions of abdominal bristle number in Drosophila melanogaster 
(females), in the base population and in the most extreme high and low lines after 35 and 34 
generations of selection. (After Clayton and Robertson, 1957.) 
Selection in both directions resulted in a much increased variance. Similar increases
have been found in many other experiments. The following are some possible reasons 
for the phenotypic variance increasing or not decreasing. First, the variance of many 
characters is not independent of the mean: when the mean changes under selection, 
the variance automatically changes with it. This is a ‘scale effect’ which will be 
more fully discussed in Chapter 17. Second, the environmental variance may increase. 
With the approach to fixation the frequency of homozygotes will increase. There 
is evidence, mentioned in Chapter 8 and to be discussed more fully in Chapter 15, 
that homozygotes are sometimes more variable from environmental causes than are 
heterozygotes. An increase of phenotypic variance from both these causes might 
counterbalance a declining heritability and so, as can be seen from equation [11.3], 
maintain a more or less constant rate of response. And, third, the genetic variance 
may not decline as expected, for reasons to be considered later. 

Mutation 

The theory of how mutation affects responses to selection has been developed by 
Hill (1982a, b) and Hill and Keightley (1988); only an outline of the conclusions
can be given here. New genetic variation is continually produced by mutation, but 
each new mutant has only a very small effect in the next few generations after its 
occurrence. This is because a newly mutated gene has a very low frequency, equal
to $1/(2N)$ (where $N$ is the population size). At such low frequencies the newly mutated
genes contribute very little to the genetic variance and without selection most new 
mutants are lost by random drift. Selection, however, can increase the frequency 
of favourable mutants so that they are not lost. As their frequencies are further 
increased by continued selection, the variance they produce increases and they contribute
more to the response. New mutations are introduced at every generation
so that the response per generation attributable to mutation gradually builds up over 
time and eventually reaches a stable value which is not negligible. When this stable 
value is reached the increase of variance due to the continuous occurrence of new 
mutations is balanced by the loss of variance due to mutant genes being fixed by 
the selection or inbreeding. It may take 20 generations before mutation begins to 
contribute appreciably to the response, and much longer before the rate of response 
becomes constant. Thus mutation is important only for long-term responses. This 
has a practical implication: attempts to improve the rate of response by the artificial 
induction of mutation cannot be expected to show any success unless both the 
mutagenesis and the selection are continued for very many generations (Hill, 1982b).

The rate of response, when it has become constant, depends on the effective population
size, $N_e$, the intensity of selection, $i$, the phenotypic standard deviation, $\sigma_P$,
and the additive variance arising from new mutants in each generation, $V_m$. If some
reasonable assumptions are made about the way mutants vary in their effects (the
distribution of gene effects), and if the genes are additive (no dominance) and do
not affect natural fitness, then the response per generation is expected to be

$R = 2 N_e\, i\, V_m / \sigma_P$

(This is analogous to equation [11.3] which can be written as $R = i V_A / \sigma_P$.) Thus
the rate of response is greater in a large population than in a small one, but it takes 
longer to reach the maximum rate because in a large population each mutant gene 
starts at a lower frequency. The mutational variance, $V_m$, can only be determined
by experiment. Several experiments with Drosophila show that for both abdominal
and sternopleural bristle numbers it is about $V_E \times 10^{-3}$ (see Hill, 1982b; Lynch,
1988); $V_E$ is the environmental variance, which provides a convenient standard.
This makes it possible to test the expectation against the observed responses in long-term
selection of Drosophila. For the experiment illustrated in Fig. 12.3(b) the
predicted rate of response from mutation was 0.3 bristle per generation and the
observed rate between generations 50 and 80 was 0.4 bristle per generation (Hill,
1982b). Thus the theory of how mutation contributes to long-term responses seems
to be well supported by this experiment. 
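
The expected steady response from mutation can be sketched as follows. The parameter values are hypothetical and are not those of the Drosophila experiment cited above; only the formula and the rough figure of $V_m \approx V_E \times 10^{-3}$ come from the text.

```python
def mutational_variance(v_e: float, ratio: float = 1e-3) -> float:
    """Mutational variance V_m as a fraction of the environmental variance V_E.
    The default ratio of 10^-3 is the rough figure quoted for bristle numbers."""
    return ratio * v_e


def response_from_mutation(n_e: float, i: float, v_m: float, sigma_p: float) -> float:
    """Steady response per generation from new mutations: 2 * N_e * i * V_m / sigma_P.
    Assumes additive genes with no effect on natural fitness."""
    return 2.0 * n_e * i * v_m / sigma_p


# Hypothetical values, for illustration only.
v_e = 4.0       # environmental variance
sigma_p = 3.0   # phenotypic standard deviation
i = 1.0         # intensity of selection

v_m = mutational_variance(v_e)
small_population = response_from_mutation(25, i, v_m, sigma_p)
large_population = response_from_mutation(100, i, v_m, sigma_p)
print(round(small_population, 4), round(large_population, 4))
```

Because the expression is linear in $N_e$, quadrupling the effective population size quadruples the eventual rate of response, though the text notes that the larger population takes longer to reach that rate.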

The general picture of what ought to happen under long-term selection when mutation
is taken into account is this. At first the response comes from the existing additive
genetic variance in the base population. The rate of response should diminish gradually 
as the additive variance is depleted. Then, after perhaps 20 generations, new variance 
from mutated genes begins to contribute to the response, which should diminish more 
slowly. After some further time all the genes segregating in the base population should 
have been brought to fixation by the selection or the accompanying inbreeding, and 
further response depends entirely on the mutations that have accumulated during 
the selection. The response should then continue at a constant rate indefinitely: a 
selection limit is not expected when there is mutation. A few experiments have shown 
responses continuing over very many generations without any sign of approaching 
a limit. One of these is illustrated in Fig. 12.3(a) in which maize was selected for
oil content (Dudley, 1977). The high line showed no sign of reaching a limit up 
to generation 76. The same was true of a line selected for high protein content 
(Dudley, 1977). Another example is the selection for high pupa weight in Tribolium, 
which continued responding for at least 75 generations (Enfield, 1977). And one 
of the six replicate lines in Fig. 12.3(b) was still responding at generation 88. Most
experiments, however, have shown responses that end at a limit. Furthermore, most 
populations at a selection limit do not lack genetic variance though the response has 
ceased. We must therefore consider the possible reasons for these facts. 

Causes of selection limits 

We have to distinguish two situations: limits where no genetic variance remains, 
and limits where genetic variance is present but the population fails to respond. In 
the first situation the problem is why there is no response from mutation. The reason 
may be simply that the new variation generated by mutation is not enough to produce
a detectable response; or, perhaps, that most mutants have an adverse effect
on natural fitness. The second situation, where genetic variance is present, is a common
one; why, then, does the population not respond?

The presence of genetic variance can be detected by reversing the direction of 
selection, or by suspending selection so that natural selection alone operates, or by 
inbreeding. The first two tests were applied to the selected lines in Fig. 12.3(c) and 
(d). The low line in Fig. 12.3(c) failed to respond to either, suggesting that no genetic 
variance remained. The same was true of the high line in Fig. 12.3(d), but here a
renewed response occurred later in the upward-selected line which led to a new and 
higher limit. This was probably due to a new mutation. The high line in Fig. 12.3(c) 
and the low line in Fig. 12.3(d), however, responded to reversed selection, showing
that additive genetic variance was present.

The following are some possible reasons for failure to respond when genetic
variance is present.

1. The limit may be an extrinsic one imposed by the nature of the character or 
the way in which it is measured. For example, percentages have limits of 0 and 
100; Drosophila bristle number cannot go below 0. These limits may be reached, 
or closely approached, with little fixation of the genes concerned. The experiment
of Fig. 12.3(a) shows this type of limit very clearly. The low line was near to a 
limit of zero oil content, while the high line, with no such extrinsic limit, continued 
to respond. 

2. Fertility is often reduced in selected lines, and the selection differential may 
be drastically reduced in the later generations. A plot of means against generations 
may then show a clear approach to a limit, but a plot against the cumulated selection 
differential may reveal little evidence that the line has really ceased to respond. Figure 
11.5 illustrates this, more particularly in the low line. In this situation the line will 
respond to reversed selection, though it may take some time to do so. 

3. Favourable alleles may be dominant. When the unfavourable recessives are 
brought to low frequencies most of the variance they cause is non-additive (see Fig. 
8.1). In this situation the mean will change in the unfavourable direction on 
inbreeding, for reasons to be explained in Chapter 14. The presence of rare 
unfavourable recessives was thought to be the situation in two experiments on mouse 
litter size, the one of Table 12.2 (Falconer, 1971) and one described by Eklund and 
Bradford (1977). In both cases inbreeding and crossing, which was thought to have 
eliminated the unwanted recessives, produced a mean well above the original limit. 

4. There could be overdominance at some loci for the character selected. At the 
selection limit overdominant genes would be in equilibrium at more or less 
intermediate frequencies. The variance they give rise to would be non-additive only. 
There would be an immediate change of mean towards the base level on inbreeding. 
There is, however, no evidence that this is a common feature of selection limits. 

5. The artificial selection may be opposed by natural selection. This should be 
detectable by the effective (weighted) selection differential being less than the expected 
(unweighted). This loss of selection differential was illustrated in Example 11.5. 
Natural selection will also be detected by suspending the artificial selection. Selection
was suspended, or ‘relaxed’, in all the lines in Fig. 12.3(c) and (d). With the
possible exception of the high line in Fig. 12.3(c), none responded, suggesting that 
natural selection was not a cause of the limits. 

6. Natural selection may favour heterozygotes through the joint action of artificial 
and natural selection. This situation occurs commonly, in contrast to simple overdominance
for the character selected. The pygmy gene of mice, referred to in Chapter
7, is an example. The gene arose by mutation in a line selected for small size (MacArthur,
1949) and heterozygotes were selected because they are smaller in size.
Homozygotes are smaller still but natural selection prevented the fixation of the gene
because homozygotes are sterile. Thus, under combined artificial and natural selection,
heterozygotes were favoured. When the selection limit is reached under this
situation, there is genetic variation due to the gene but no further response. If artificial 
selection is relaxed, the line responds to the natural selection. If selection is reversed, 
the artificial and natural selection both act in the same direction and there is an immediate
and often rapid response. This may be regarded as an extreme form of asymmetrical
response.

Natural selection acting through lethality rather than sterility has often been found
in selected Drosophila lines. In the line selected for high bristle number illustrated 
in Fig. 12.4, a gene was present which was lethal in homozygotes and which increased 
bristle number by 22 in the heterozygotes, which was 5.8 times the original phenotypic 
standard deviation. Five of the six lines in Fig. 12.3(b) were found to contain lethals
with large effects on bristle number, of up to nearly $5\sigma_P$ (Yoo, 1980b). These lethals
accounted for the reduction of mean on suspension of selection. They arose by mutation
during the selection and they accounted for the accelerated responses seen particularly
clearly in one of the lines. The nature of the genetic variation in these lines
at the end of the experiment was further analysed by Yoo (1980c). Heritabilities 
estimated from resemblances between relatives were found to be as high in some 
of the lines as in the base population. 

Relevance to practical breeding. It may be thought that experimental studies of 
long-continued selection are of little relevance to the practice of animal and plant 
improvement, because the breeder is concerned only with responses in the short 
term. Breeds of livestock, however, have already been under selection for a very 
long time and, in the case of broiler chickens, recent selection has been very intense. 
If they behave like some laboratory populations they might now be at selection limits 
and no longer responding to selection. If this were the case, an understanding of 
the nature of selection limits would be very relevant to the exploration of methods 
of overcoming the limit and making further progress. Fortunately, however, the main 
animal species do not seem to be at selection limits because continued improvement 
is being made (Smith, 1984, 1988; and see further at the end of the next chapter). 
For discussions of the bearing of laboratory experiments on animal breeding, see 
Roberts (1967a, b), Falconer (1971), Eklund and Bradford (1977), Hill (1982b) and,
reviewing experiments with mice, Eisen (1980). 

Number of loci (effective factors) and standardized effects 
Since the total response depends primarily on the number of loci, it is tempting to 
try to use the observed response to estimate the number of loci that have contributed 
to it. There are, however, serious difficulties in interpreting any number so obtained, 
centring on what is meant by a locus in this context. There are two main difficulties. 
First, we do not know the form of the distribution of gene effects: do most genes 
have effects of more or less equal magnitude, or — which seems more likely — 
are there a few genes with large effects and increasing numbers with smaller and 
smaller effects? Then where do we stop counting a locus as one affecting the 
character? The second difficulty concerns linkage. What we count as a locus is a 
segment of chromosome that has not recombined in the course of the selection. In 
recognition of linkage, loci in this context are referred to as effective factors and 
their number as the effective number. Despite these difficulties it is worth while 
to consider briefly how the number of effective factors may be estimated. 

The effective number of loci can be estimated by equation [12.1] as $n = R_T^2 / (8\sigma_A^2)$,
where $R_T$ is the difference between the upper and lower selection limits, and $\sigma_A^2$ is
the additive variance in the base population. The estimate made by equation [12.1]
is valid on three conditions: (1) all the favourable alleles have been fixed at both
limits; (2) all the genes have equal effects; and (3) all the genes have initial frequencies
of 0.5. Failure of conditions (1) or (2) leads to the estimate of n being too low.
Failure of condition (3) leads to n being overestimated because $\sigma_A^2$ will then be less
than it would be with gene frequencies of 0.5 as required. The requirement of condition
(3) can be met by estimating $\sigma_A^2$ not in the base population but in the $F_2$ and
subsequent generations of a cross between lines at the upper and lower limits. All
genes by which the lines differ are then at frequencies of 0.5. But the problem of
linkage is then magnified, since all relevant genes are in complete linkage disequilibrium
in the $F_1$. With the reduction of disequilibrium by recombination, the
genetic variance will decrease progressively in the generations following the $F_2$ and
the number of effective factors will correspondingly increase. Some of the difficulties
arising from the failure of the conditions can be overcome if data from $F_1$, $F_2$ and
backcross generations are available (Lande, 1981).

The numbers estimated by equation [12.1] are given for the four experiments in 
Table 12.2. They range from 2 to 98. The estimate of 2 for litter size seems too 
low to be believed, which suggests that the assumptions made were seriously in error. 
A similar experiment, but selecting only upward, yielded an estimate of n = 164 
on the assumption that all the genetic variation was due to recessive genes at initial 
gene frequencies of 0.25 (Eklund and Bradford, 1977). Analysis of the upward 
response of the experiment in Table 12.2 with the same assumptions gives n = 25. 
This large difference according to the assumptions made emphasizes the dubious 
value of these estimates of gene numbers. 

From the number of effective factors it is possible to obtain the standardized effects 
of the genes, subject to the same conditions and qualifications. Rearrangement of 
the expression for the additive variance given earlier, $\sigma_A^2 = \frac{1}{2} n a^2$, leads to the standardized
effects of the genes as $2a/\sigma_P = 2h\sqrt{2/n}$, where $h$ is the square root of
the heritability. The values obtained are close to 0.2 in three of the experiments 
in Table 12.2; the litter size experiment gives a value of 0.95 for n = 2, or 0.21 
for n = 25. 
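
The two estimates can be checked against each other with a small consistency test. The sketch below builds an idealized case that satisfies all three conditions (n equal, additive loci at frequencies of 0.5, all fixed at the limits) and confirms that the formulas recover the number of loci and the effect that were put in. The numbers are hypothetical and the function names are illustrative.

```python
from math import sqrt


def effective_number(r_t: float, var_a: float) -> float:
    """Equation [12.1]: n = R_T^2 / (8 * sigma_A^2)."""
    return r_t ** 2 / (8.0 * var_a)


def standardized_effect(h2: float, n: float) -> float:
    """Standardized effect 2a / sigma_P = 2 * h * sqrt(2 / n), with h = sqrt(h^2)."""
    return 2.0 * sqrt(h2) * sqrt(2.0 / n)


# Hypothetical idealized population.
n_true = 10     # loci, all with equal effects and frequencies of 0.5
a = 0.5         # half the difference between the two homozygotes at a locus
var_p = 5.0     # phenotypic variance

r_t = 2 * n_true * a              # range between upper and lower limits
var_a = 0.5 * n_true * a ** 2     # additive variance at frequencies of 0.5
h2 = var_a / var_p

n_est = effective_number(r_t, var_a)
effect = standardized_effect(h2, n_est)
print(n_est, round(effect, 4), round(2 * a / sqrt(var_p), 4))
```

Agreement holds only because the idealized case meets the three conditions; with real data any departure from them biases the estimate in the directions described in the text.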

The use of marker genes in Drosophila makes it possible to localize and identify 
some of the genes that have contributed to a selection response. The situation to 
which these studies point is not one of a large number of genes all with more or 
less equal effect. It seems, rather, that a small number of genes with large effects 
are responsible for most of the response, the remainder of the response being due 
to a larger number of loci with small effects. For example, five loci accounted for 
87.5 per cent of the response of a line of Drosophila selected for high sternopleural 
bristle number (Spickett and Thoday, 1966). In another analysis of sternopleural 
bristles, 18 effective factors were found on the 3rd chromosome (Shrimpton and 
Robertson, 1988), with effects ranging from $1.1\sigma_P$ down to the smallest effect that
the analysis was capable of detecting. There were more effective factors with small 
effects than with large. Experiments of this sort do not provide estimates of the total 
number of genes or of effective factors, because the number found depends on the 
scale of the experiment; the more work done, the more effective factors are found. 
They do, however, give some evidence about the distribution of gene effects and 
they prove that much of the variation is due to a few genes with large effects and 
the rest to many genes with small effects. There is evidence also from selection
experiments which points to the responses of abdominal bristles in Drosophila being
due partly to a few genes with large effects (Frankham, Jones, and Barker, 1968;
Yoo, 1980b).

Problem

12.1 Calculate the quantities listed in Table 12.2 from the results of the following 
experiment. (The last two quantities will have to be based on a guess.) Mice were 
selected for increased 3—6 week weight gain over 43 generations. There was no 
selection for decreased gain but there was an unselected control. The response was 
linear over the first 19 generations and reached a limit at about generation 34. The 
data needed are in the table, males and females being averaged. The figures in parentheses
are the generations to which the data refer.

Realized heritability (1-19)                           0.20
Phenotypic standard deviation (1-19)                   2.10 g
Mean selection differential per generation (1-19)      2.25 g
Mean of selected line (34-43)                          23.55 g
Mean of control line (34-43)                           11.90 g
Effective population size                              33


Data from Barria, N. & Bradford, G.E. (1981) J. Anim. Sci., 52, 729-38.


[Solution 67] 



13 SELECTION:

III. Information from relatives


In our consideration of selection we have up to now supposed that individuals are 
measured for the character to be selected and that the best are chosen to be parents 
in accordance with the individual phenotypic values. An individual’s own phenotypic 
value, however, is not the only source of information about its breeding value; additional
information is provided by the phenotypic values of relatives, particularly by
those of full or half sibs. With some characters, indeed, the values of relatives provide
the only available information. Milk-yield, to take an obvious example, cannot
be measured in males, so the breeding value of a male can only be judged from
the phenotypic values of its female relatives. 

The use of information from relatives is of great importance in the application 
of selection to animal breeding, for two reasons. First, the characters to be selected 
are often ones of low heritability, and with these the mean value of a number of 
relatives often provides a more reliable guide to breeding value than the individual’s 
own phenotypic value. And, second, when the outcome of selection is a matter of 
economic gain, even quite a small improvement of the response will repay the extra 
effort of applying the best technique. In this chapter we shall outline the principles 
underlying the use of information from relatives and the choice of the best method 
of selection. 

If the family structure of the population is taken into account we can compute 
the mean phenotypic value of each family; this is known as the family mean. Suppose,
then, that we have a population in which the individuals are grouped in families,
which may be full or half sibs, and we have measurements of each individual and 
of the means of every family. How then is the additional information from the family 
means to be used? The problem may best be explained by reference to a specific 
example. Table 13.1 gives some hypothetical but realistic values of litter size in 
mice. There are 16 individuals whose phenotypic values are entered in the body 
of the table. The individuals are grouped in four full-sib families, A to D, with 4 
individuals in each family. We have to choose the best 4 of these 16 individuals. 
Basing the choice on the individual phenotypic values we have no difficulty in choosing
individuals A1, B1, and A2 with values 13, 11, and 10 respectively. But now there
are two with values of 9, B2 in a good family and D1 in a bad family. Which do 
we choose? The decision rests on whether the differences between families are mainly 
genetic or mainly environmental. If they are genetic we choose B2, on the grounds 
that its better family mean indicates a better breeding value. If, on the other hand, 

Table 13.1 Examples of individual values and family means for selection, as explained in the text.

                        Family
Individual        A      B      C      D
1                13     11      7      9
2                10      9      7      5
3                 8      6      6      3
4                 5      6      4      3
Family mean       9      8      6      5
Overall mean                7

the differences between families are mainly environmental we would choose D1,
on the grounds that its low family mean indicates a poor environment and that it 
has performed well despite this disadvantage. The problem is not only in 
discriminating between individuals with the same phenotypic values, but is a matter 
of finding the right weight to be given to the family means. With the correct weighting 
we might be led to choose A3 with 8 in place of B2 with 9. Application of the principles
to be developed shows that this would in fact be the best procedure if these
values were litter sizes of mice (see Example 13.1).

To calculate the best weighting of the family means, only three things need be 
known: the kind of family (whether full or half sibs), the number of individuals in 
the families (i.e. the family size), and the phenotypic correlation between members 
of the families with respect to the character. The information needed to solve what 
seems a complex problem is thus surprisingly simple; but the explanation of the 
underlying principles is not so simple. The explanation will be presented in two ways. 
First we shall extend the concept of heritability as a determinant of the response 
to selection. This introduces no new principles and leads fairly easily to a solution 
of the problem posed above; but it is not convenient for the solution of more complex
problems found in practice. Then, under the heading of ‘Index selection’, a
more general solution will be briefly explained. This allows information from different
sorts of relatives to be combined, for example from parents as well as sibs.
It also allows information of a different kind to be used as an aid to selection, in 
a way to be explained in Chapter 19. 

Criteria for selection 

The phenotypic value of an individual, $P$, measured as a deviation from the population
mean, is the sum of two parts: the deviation of its family mean from the population
mean, $P_f$, and the deviation of the individual from the family mean, $P_w$ (the within-family
deviation); so that

$P = P_f + P_w$ ... [13.1]

The procedure of selection, then, varies according to the attention paid, or the weight 
given, to these two parts. There are three simple procedures that can be followed. 
First, we may select on the basis of individual values only, as assumed in the last
two chapters, giving equal weight to the two components $P_f$ and $P_w$. This is known
as individual selection. Second, we may select on the basis of the family mean $P_f$
only, giving zero weight to the within-family deviation $P_w$. This is known as family
selection. Applied to Table 13.1, all four individuals in family A would be selected.
Third, we may select on the basis of the within-family deviation $P_w$ alone, giving
zero weight to the family mean $P_f$. This is known as within-family selection. Applied
to Table 13.1, the best individual in each of the four families would be selected.
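
The three simple procedures can be applied mechanically to the values of Table 13.1. The sketch below splits each value into its family and within-family deviations (equation [13.1]) and then picks parents under each procedure. The variable names are illustrative, and ties are resolved arbitrarily by taking the first individual listed.

```python
from statistics import mean

# Litter sizes from Table 13.1: family -> values of individuals 1 to 4.
families = {
    "A": [13, 10, 8, 5],
    "B": [11, 9, 6, 6],
    "C": [7, 7, 6, 4],
    "D": [9, 5, 3, 3],
}

overall_mean = mean([v for vals in families.values() for v in vals])
family_mean = {fam: mean(vals) for fam, vals in families.items()}

# One record per individual: (name, value, P, P_f, P_w), all as deviations.
records = []
for fam, vals in families.items():
    for k, value in enumerate(vals, start=1):
        p_f = family_mean[fam] - overall_mean
        p_w = value - family_mean[fam]
        records.append((f"{fam}{k}", value, value - overall_mean, p_f, p_w))

# Equation [13.1]: P = P_f + P_w for every individual.
assert all(p == p_f + p_w for _, _, p, p_f, p_w in records)

# Individual selection: rank on P; the fourth place is tied.
by_value = sorted(records, key=lambda r: r[2], reverse=True)
clear_choices = [r[0] for r in by_value[:3]]
tied_for_fourth = [r[0] for r in records if r[1] == by_value[3][1]]

# Family selection: take every member of the family with the best mean.
best_family = max(family_mean, key=family_mean.get)
family_choice = [r[0] for r in records if r[0].startswith(best_family)]

# Within-family selection: the largest P_w in each family
# (C1 and C2 are tied; max() keeps the first of them).
within_choice = [
    max((r for r in records if r[0].startswith(fam)), key=lambda r: r[4])[0]
    for fam in families
]

print(clear_choices, tied_for_fourth)
print(family_choice)
print(within_choice)
```

Individual selection gives A1, B1 and A2 and leaves B2 and D1 tied for the fourth place, which is exactly the dilemma posed earlier; family selection takes the whole of family A, and within-family selection takes one individual from each family.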

Instead of one or other of these three simple procedures, we may take account 
of both components, $P_f$ and $P_w$, but give them different weights chosen so as to
make the best use of the two sources of information. This is known as selection 
by optimum combination or combined selection or, more generally, index selection. 
It represents the general solution for obtaining the maximum rate of response, and 
the other three simpler methods are special cases in which the weights given to the 
two sources of information are either 1 or 0. It is therefore in principle always the 
best method. The appropriate weighting of $P_f$ and $P_w$ will be explained later.

Simple methods 

The salient features of the three simpler methods are as follows. 

Individual selection. Individuals are selected solely in accordance with their own 
phenotypic values. This method is usually the simplest to operate and in many circumstances
it yields the most rapid response. It should therefore be used unless there
are good reasons for preferring another method. Mass selection is a term often used 
for individual selection, especially when the selected individuals are put together 
en masse for mating, as for example Drosophila in a bottle. The term ‘individual 
selection’ is used more specifically when the matings are controlled or recorded, 
as with mice or larger animals. 

Family selection. Whole families are selected or rejected as units, according to 
the mean phenotypic value of the family. Individual values are thus not acted on 
except in so far as they determine the family mean. In other words, the within-family 
deviations are given zero weight. The families may be of full sibs or half sibs, families 
of more remote relationship being of little practical significance. 

The chief circumstance under which family selection is to be preferred is when 
the character selected has a low heritability. The efficacy of family selection rests 
on the fact that the environmental deviations of the individuals tend to cancel each 
other out in the mean value of the family. So the phenotypic mean of the family 
comes close to being a measure of its genotypic mean, and the advantage gained 
is greater when environmental deviations constitute a large part of the phenotypic 
variance, or in other words when the heritability is low. On the other hand, environmental
variation common to members of a family impairs the efficacy of family selection.
If this component is large it will tend to swamp the genetic differences between
families, and family selection will be correspondingly ineffective. Another important
factor in the efficacy of family selection is the number of individuals in the
families, or the family size. The larger the family, the closer is the correspondence 
between mean phenotypic value and mean genotypic value. So the conditions that 
favour family selection are low heritability, little variation due to common environment,
and large families.
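
The way environmental deviations cancel in a family mean can be put in numbers. The sketch below is an assumption-laden illustration rather than a formula from the text: it supposes that the environmental deviations peculiar to individuals are independent with variance v_ew, so that their average over n family members has variance v_ew / n, while a common-environment component with variance v_ec is shared by all members and is therefore not reduced by averaging. Both names and all values are hypothetical.

```python
def env_variance_of_family_mean(v_ew: float, v_ec: float, n: int) -> float:
    """Environmental variance remaining in the mean of a family of size n.

    v_ew: variance of environmental deviations peculiar to individuals
          (assumed independent, so averaging divides it by n)
    v_ec: variance due to environment common to the family
          (shared by all members, so averaging leaves it unchanged)
    """
    return v_ec + v_ew / n


# Hypothetical variances, for illustration only.
for n in (1, 4, 16):
    no_common = env_variance_of_family_mean(v_ew=8.0, v_ec=0.0, n=n)
    with_common = env_variance_of_family_mean(v_ew=8.0, v_ec=2.0, n=n)
    print(n, no_common, with_common)
```

With no common environment the environmental variance of the family mean falls steadily as families get larger, whereas a common-environment component sets a floor that no family size can remove, which is why it impairs family selection.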

There are practical difficulties in the application of family selection, particularly 
in laboratory populations. They arise from the conflict between the intensity of selection
and the avoidance of inbreeding. It is generally desirable to keep the rate of
inbreeding as low as possible. If the minimum number of parents is fixed by considerations
of inbreeding, say at ten pairs, then under family selection ten families
must be selected, since each family represents only one pair of parents in the previous 
generation. And, if a reasonably high intensity of selection is to be achieved, the 
number of families bred and measured must be perhaps twice to four times this 
number. Family selection is thus costly of space, and if breeding space is limited 
the intensity of selection that can be achieved under family selection may be quite 
small. The two following methods are variants of family selection. 

Sib selection. Some characters, as we have already noted, cannot be measured on 
the individuals that are to be used as parents, and selection can only be based on 
the values of relatives. This amounts to family selection but with the difference that 
now the selected individuals have not contributed to the estimate of their family mean. 
The difference affects the way in which the response is influenced by family size. 
Where the distinction is of consequence we shall use the term sib selection when 
the selected individuals are not measured, and family selection when they are 
measured and included in the family mean. When families are very large the two 
methods are equivalent, and the term family selection is then to be understood to 
cover both. 

Progeny testing is a method of selection widely applied in animal breeding. We 
shall not discuss it in detail, except in so far as it can be treated as a form of family 
selection. The criterion of selection, as the name implies, is the mean value of an 
individual’s progeny. At first sight this might seem to be the ideal method of selection
and the easiest to evaluate because, as we saw in Chapter 7, the mean value
of an individual’s offspring comes as near as we can get to a direct measure of its
breeding value, and it is in fact the operational definition of breeding value. In practice,
however, it suffers from the serious drawback of a much-lengthened generation
interval, because the selection of the parents cannot be carried out until the
offspring have been measured. The evaluation of selection by progeny testing is apt 
to be rather confusing because of the inevitable overlapping of generations and because 
of a possible ambiguity about which generation is being selected, the parents or the 
progeny. The progeny, whose mean is used to judge the parents, are ready to be 
used as parents just when the parents have been tested and await selection. Thus 
both the selected parents and their progeny are used concurrently as parents. The 
difficulty of interpretation may be partially overcome by regarding progeny testing 
as a modified form of family selection. The progenies are families, usually of half 
sibs, and selection is made between them on the basis of the family means in the 
manner described above. The only difference is that the selected families are increased 
in size by allowing their parents to go on breeding. The additional, younger, members 
of the families do not contribute to the estimates of the family means and are therefore 
selected by sib selection. Increasing the size of the selected families by unmeasured
individuals does not improve the accuracy of the selection, but it reduces the replacement
rate and so increases the intensity of selection that can be achieved. This is
the principal advantage of progeny testing, but it can only be realized in operations 
on a large scale, when the danger of inbreeding is not introduced by limitation of 
space. 

Within-family selection. The criterion of selection is the deviation of each individual 
from the mean value of the family to which it belongs, those that exceed their family 
mean by the greatest amount being regarded as the most desirable. This is the reverse 
of family selection, the family means being given zero weight. The chief condition 
under which this method has an advantage over the others is a large component of 
environmental variance common to members of a family. Pre-weaning growth of 
pigs or mice might be cited as examples of such a character. A large part of the 
variation of individuals’ weaning weights is attributable to the mother and is therefore 
common to members of a family. Selection within families would eliminate this large 
non-genetic component from the variation operated on by selection. An important 
practical advantage of selection within families, especially in laboratory experiments, 
is that it economizes breeding space, for the same reason that family selection is 
costly of space. If single-pair matings are to be made, then two members of every 
family must be selected in order to replace the parents. This means that every family
contributes equally to the parents of the next generation, a system that we saw
in Chapter 4 renders the effective population size twice the actual. Thus when selection
within families is practised, the breeding space required to keep the rate of inbreeding
below a certain value is only half as great as would be required under
individual selection. 

Prediction of response 

To evaluate the relative merits of the different methods of selection we have to deduce 
the response expected from each. There is nothing to be added here about individual 
selection to what was said in Chapter 11. The expected response was given in equation
[11.3] as $R = i\sigma_P h^2$, where i is the intensity of selection (i.e. the selection
differential in standard deviations), $\sigma_P$ is the standard deviation and $h^2$ the heritability
of the phenotypic values of individuals. The response expected under family selection
or within-family selection is arrived at in an analogous manner. Under family
selection, the criterion of selection is the mean phenotypic value of the members
of a family, so the expected response to family selection is

$$R_f = i\sigma_f h_f^2 \qquad \ldots [13.2]$$

where i is the intensity of selection, $\sigma_f$ is the observed standard deviation of family
means, and $h_f^2$ is the heritability of family means. In the same way, the expected
response to within-family selection is

$$R_w = i\sigma_w h_w^2 \qquad \ldots [13.3]$$

where $\sigma_w$ is the standard deviation, and $h_w^2$ the heritability of within-family
deviations.

Heritability. The concept of heritability applied to family means or to within-family 
deviations introduces no new principle. It is simply the proportion of the phenotypic
variance of these quantities that is made up of additive genetic variance. These
heritabilities can be expressed in terms of the heritability of individual values (which
we shall continue to refer to simply as the heritability, with symbol $h^2$), the
phenotypic correlation between members of families, and the number of individuals
in the families, all of which can be estimated by observation. To arrive at the
appropriate expressions we have to consider again how the observational components
of variance are made up of the causal components, as explained in Chapters 9 and
10 (see in particular Tables 9.5 and 10.4). First let us simplify matters by supposing
that all families contain a large number of individuals, so that the means of all families 
are estimated without error. Consider first the phenotypic variance. The intraclass
correlation t between members of families is the between-group component divided
by the total variance: $t = \sigma_B^2/\sigma_T^2$. Therefore the between-group component can be
expressed as $\sigma_B^2 = t\sigma_T^2$ and the within-group component as $\sigma_W^2 = (1 - t)\sigma_T^2$. This
expresses the partitioning of the phenotypic variance into its observational components.
The total variance, written here as $\sigma_T^2$, is the phenotypic variance which
we shall write as $V_P$ in the context of causal components. The partitioning of the
additive variance between and within families can be expressed in the same way,
in terms of the correlation of breeding values, r. Thus the additive variance between
families is $rV_A$ and the additive variance within families is $(1 - r)V_A$. The dual
partitioning is summarized in Table 13.2.

This partitioning of both the additive and the phenotypic variance leads at once 
to the heritabilities of family means and of within-family deviations, since these 
heritabilities are simply the ratios of the additive variance to the phenotypic variance. 
Thus, when the families are large, the heritability of family means is $rV_A/tV_P$, or
$(r/t)h^2$, since $V_A/V_P$ is the heritability of individual values, $h^2$. The values of r for
different relationships were given in Table 9.3; for full sibs it is $\frac{1}{2}$ and for half sibs
it is $\frac{1}{4}$. In order to be able to discuss full-sib and half-sib families at the same time
in what follows, we shall retain the symbol r in the formulae instead of inserting
the appropriate values of $\frac{1}{2}$ or $\frac{1}{4}$.
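
As a rough numerical illustration of the large-family case, the short Python sketch below evaluates the two ratios implied by Table 13.2. The function name and the input values (a heritability of 0.2, full sibs, and no common environment, so that t = rh² = 0.1) are assumptions chosen for illustration and are not figures from the text.

```python
def large_family_heritabilities(h2, r, t):
    """Heritabilities for very large families, from the ratios in Table 13.2."""
    h2_family_means = (r / t) * h2               # rV_A / tV_P
    h2_within_family = ((1 - r) / (1 - t)) * h2  # (1 - r)V_A / (1 - t)V_P
    return h2_family_means, h2_within_family


# Hypothetical values: full sibs (r = 0.5), h2 = 0.2, and t = r * h2 = 0.1.
h2_f, h2_w = large_family_heritabilities(h2=0.2, r=0.5, t=0.1)
print(round(h2_f, 3), round(h2_w, 3))
```

Under these assumed values the heritability of family means comes out at 1, reflecting the point made earlier that the environmental deviations cancel in the mean of a very large family when there is no common environment.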

The foregoing account of the heritabilities of family means and within-family deviations
was simplified by the supposition of large families. The simplification is not
justified in practice and we must now remove it by considering families of finite 
size. We shall, however, suppose that all families are of equal size. The number 
of individuals in a family has to be taken into consideration for the following reason. 
If selection is based on the family mean, or on the deviations from the family mean, 
then it is the observed mean that we are concerned with and not the true mean. In 
other words we are not concerned with the observational components of variance 
which we have hitherto discussed, but with the variance of the observed means and 
of the observed within-family deviations. The observed means of groups are subject 
to sampling variance which comes from the within-group variance. If there are n 


Table 13.2 Partitioning of the variance between and within families of large size.

Observational component            Additive variance    Phenotypic variance

Between families, $\sigma_B^2$     $rV_A$               $tV_P$
Within families, $\sigma_W^2$      $(1 - r)V_A$         $(1 - t)V_P$


Table 13.3 Composition of observed variances with families of size n.

                                                                                     Causal components
Observed variance                            Observational components                Additive                           Phenotypic

Of family means, $\sigma_f^2$                $\sigma_B^2 + \frac{1}{n}\sigma_W^2$    $\frac{1 + (n - 1)r}{n}V_A$        $\frac{1 + (n - 1)t}{n}V_P$
Of within-family deviations, $\sigma_w^2$    $\sigma_W^2 - \frac{1}{n}\sigma_W^2$    $\frac{(n - 1)(1 - r)}{n}V_A$      $\frac{(n - 1)(1 - t)}{n}V_P$


individuals in a group then the sampling variance of the group-mean is $(1/n)\sigma_W^2$,
where $\sigma_W^2$ is the component of variance within the group. Thus the variance of
observed group-means is augmented by $(1/n)\sigma_W^2$, and the variance of observed
deviations within groups is correspondingly diminished by the same amount. The
observed variances, with family size n, are therefore made up of the observational 
components as shown in Table 13.3. The causal components entering into the 
observed variances can now be found by translating the observational components 
into causal components from Table 13.2. They are shown in the two right-hand 
columns of Table 13.3. 
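
The bookkeeping described above can be sketched in a few lines of Python. The function name and the values n = 4 and t = 0.1 are illustrative assumptions; the phenotypic variance is set to 1 so that the results are proportions of V_P. The sketch also shows that the amount added to the variance of family means is exactly the amount removed from the variance of within-family deviations, so the two observed variances still sum to V_P.

```python
def observed_variances(n, t, V_P=1.0):
    """Observed variances of family means and of within-family deviations (Table 13.3)."""
    var_between = t * V_P        # observational component between families
    var_within = (1 - t) * V_P   # observational component within families
    sampling = var_within / n    # sampling variance of a family mean
    var_family_means = var_between + sampling
    var_within_deviations = var_within - sampling
    return var_family_means, var_within_deviations


# Hypothetical values: families of four with a phenotypic correlation of 0.1.
v_f, v_w = observed_variances(n=4, t=0.1)
print(round(v_f, 3), round(v_w, 3), round(v_f + v_w, 3))
```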

To find the heritabilities of family means and of within-family deviations, we have 
only to divide the additive component by the phenotypic component of the observed 
variances. Thus the heritability of family means is 


$$h_f^2 = \frac{1 + (n - 1)r}{1 + (n - 1)t}\,h^2 \qquad \ldots [13.4]$$

and the heritability of within-family deviations is

$$h_w^2 = \frac{1 - r}{1 - t}\,h^2 \qquad \ldots [13.5]$$


Sib selection has to be distinguished from family selection, from which it differs 
in that the selected individuals are not measured. The appropriate heritability is best 
deduced by considering it as a regression in the manner of equation [10.2]. In this 
case it is the regression of the breeding values of the unmeasured individuals on 
the mean phenotypic value of their measured sibs. The covariance of these is simply
the covariance of full or half sibs, i.e., $rV_A$, and it is not affected by the numbers
of either the measured or the unmeasured individuals for reasons explained in Chapter
9. The regression is therefore $rV_A/\sigma_f^2$, where $\sigma_f^2$ is the observed variance of the
family means of the measured individuals as given in Table 13.3. Substitution gives
the heritability of family means appropriate to sib selection as

$$h_s^2 = \frac{nr}{1 + (n - 1)t}\,h^2 \qquad \ldots [13.6]$$

The heritabilities of the different methods of selection, whose derivations have now
been explained, are listed in Table 13.4.
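
The three heritabilities of equations [13.4] to [13.6] can be evaluated together, as in the following Python sketch. The function name, the dictionary keys and the numerical inputs are assumptions made for illustration only. The second call uses a very large family size to show the family and sib heritabilities converging, in line with the earlier remark that the two methods are equivalent when families are very large.

```python
def heritabilities(h2, n, r, t):
    """Heritabilities appropriate to family, sib and within-family selection."""
    denom = 1 + (n - 1) * t
    return {
        "family": h2 * (1 + (n - 1) * r) / denom,  # equation [13.4]
        "sib": h2 * n * r / denom,                 # equation [13.6]
        "within": h2 * (1 - r) / (1 - t),          # equation [13.5]
    }


# Hypothetical values: h2 = 0.2, full sibs (r = 0.5), t = 0.1.
small = heritabilities(h2=0.2, n=4, r=0.5, t=0.1)
large = heritabilities(h2=0.2, n=10**6, r=0.5, t=0.1)
print({k: round(v, 3) for k, v in small.items()})
print({k: round(v, 3) for k, v in large.items()})
```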


Expected responses. To deduce the expected responses is now a simple matter. 
Family selection will be taken for illustration. The expected response was given in 
equation [13.2] as $R_f = i\sigma_f h_f^2$. This expression, however, is not much use as it
stands, because it does not readily allow a comparison to be made with the other
methods. It will be most convenient to cast it into a form that facilitates comparison
with individual selection. This can be done by substituting the expression for the
heritability of family means, $h_f^2$, given in equation [13.4], and by putting the standard
deviation of observed family means, $\sigma_f$, in terms of the standard deviation of
individual phenotypic values, $\sigma_P$ ($= \sqrt{V_P}$), from the right-hand column of Table 13.3.
The expected response then becomes 


$$R_f = i \times \sqrt{\frac{1 + (n - 1)t}{n}}\;\sigma_P \times \frac{1 + (n - 1)r}{1 + (n - 1)t}\,h^2$$

which reduces to

$$R_f = i\sigma_P h^2 \left[\frac{1 + (n - 1)r}{\sqrt{n\{1 + (n - 1)t\}}}\right]$$


The term $i\sigma_P h^2$ is equivalent to the expected response under individual selection,
so the expression within the square brackets is the factor that compares family selection
with individual selection. The expression looks very complicated but it contains
only three simple quantities: n, which is the family size; r, which is $\frac{1}{2}$ for full-sib
and $\frac{1}{4}$ for half-sib families; and t, which is the phenotypic intraclass correlation.

The expected responses under the different methods of selection are listed in Table 
13.4, all expressed in this manner which allows the comparisons to be made with 
individual selection. 
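
The reduction of the family-selection response to a single comparison factor can be checked numerically. In the Python sketch below, both function names and the input values are illustrative assumptions; the first function is the reduced factor and the second is the same quantity before simplification, written as the ratio of standard deviations multiplied by the ratio of heritabilities.

```python
import math


def family_response_factor(n, r, t):
    """Factor comparing family selection with individual selection (reduced form)."""
    return (1 + (n - 1) * r) / math.sqrt(n * (1 + (n - 1) * t))


def family_response_factor_unreduced(n, r, t):
    """Same factor before simplification: (sigma_f / sigma_P) * (h_f^2 / h^2)."""
    x = 1 + (n - 1) * t
    sd_ratio = math.sqrt(x / n)          # sigma_f / sigma_P, from Table 13.3
    h2_ratio = (1 + (n - 1) * r) / x     # h_f^2 / h^2, from equation [13.4]
    return sd_ratio * h2_ratio


# Hypothetical values: full-sib families of eight with t = 0.1.
print(round(family_response_factor(8, 0.5, 0.1), 3))
print(round(family_response_factor_unreduced(8, 0.5, 0.1), 3))
```

A factor above 1 indicates that, at equal intensity of selection, family selection is expected to give the larger response.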


Combined selection 

Combined selection will be dealt with very briefly here because it will be more fully 
explained later. The appropriate weighting factors to be used in its application can 
be deduced as follows. We saw before that the phenotypic value of an individual 
is made up of two parts, the family mean and the within-family deviation, $P = P_f + P_w$,
and that each part gives some information about the individual’s breeding
value. In Chapter 10 we saw that the heritability is equivalent to the regression of
breeding value on phenotypic value (equation [10.2]), so that the best estimate of
an individual’s breeding value to be derived from its phenotypic value is $h^2P$ (equation
[10.4]). This idea can be applied separately to the two parts of the phenotypic
value, since these are uncorrelated and supply independent information about the
breeding value. Therefore, taking both parts of the phenotypic value into account,
the best estimate of an individual’s breeding value is given by the multiple regression
equation

$$\text{expected breeding value} = h_f^2P_f + h_w^2P_w \qquad \ldots [13.7]$$





Table 13.4 Heritability and expected response under different methods of selection.

Method of selection    Heritability                                         Expected response

Individual             $h^2$                                                $R = i\sigma_P h^2$
Family                 $h_f^2 = h^2\,\frac{1 + (n - 1)r}{1 + (n - 1)t}$     $R_f = i\sigma_P h^2\,\frac{1 + (n - 1)r}{\sqrt{n\{1 + (n - 1)t\}}}$
Sib                    $h_s^2 = h^2\,\frac{nr}{1 + (n - 1)t}$               $R_s = i\sigma_P h^2\,\frac{nr}{\sqrt{n\{1 + (n - 1)t\}}}$
Within-family          $h_w^2 = h^2\,\frac{1 - r}{1 - t}$                   $R_w = i\sigma_P h^2\,(1 - r)\sqrt{\frac{n - 1}{n(1 - t)}}$
Combined               -                                                    $R_c = i\sigma_P h^2\,\sqrt{1 + \frac{(r - t)^2}{1 - t}\cdot\frac{n - 1}{1 + (n - 1)t}}$

i = intensity of selection (selection differential in standard measure): assumed to be equal for all methods,
but not necessarily so.
$\sigma_P$ = standard deviation of phenotypic values of individuals.
$h^2$ = heritability of individual values.
r = correlation of breeding values of members of the families:
    with full-sib families, $r = \frac{1}{2}$;
    with half-sib families, $r = \frac{1}{4}$.
t = correlation of phenotypic values of members of the families.
n = number of individuals in the families.


$P_f$ being measured as a deviation from the population mean, and $P_w$ as a deviation
from the family mean. The weighting factors that make the most efficient use of
the two sources of information are therefore the two heritabilities, which are the
partial regression coefficients on family mean and within-family deviation respectively.
If the values of the two heritabilities from Table 13.4 are inserted in equation
[13.7], it will be seen that the term $h^2$ is common to both weighting factors, and
this term may therefore be omitted without affecting the relative weighting. This
gives the expected breeding value as

$$E(A) = \left[\frac{1 - r}{1 - t}\right]P_w + \left[\frac{1 + (n - 1)r}{1 + (n - 1)t}\right]P_f$$

In practice it is more convenient to work with the individual values in place of 
the within-family deviations, and to assign them a weight of 1. The family mean 
is thus used in the manner of a correction, supplementing the information provided 
by the individual itself. Rearrangement of the appropriate weighting factor for the 
family mean leads to an index of merit as follows (Lush, 1947): 


$$I = P + \left[\frac{r - t}{1 - r} \times \frac{n}{1 + (n - 1)t}\right]P_f \qquad \ldots [13.8]$$


In this equation, P is the individual value, and $P_f$ is the deviation of the family mean
from the population mean, the individual itself being included in the family mean.
Note that the weighting of $P_f$ is negative if t is greater than r. This can only occur
when there is a large environmental component in the correlation; the family mean 
is then an indicator of environment rather than of breeding value. The expected 
response to combined selection, cast in a form suitable for comparison with individual 
selection, is given in Table 13.4. For its derivation see Lush (1947). 
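
The rearrangement that leads to equation [13.8] can be verified numerically. Writing P = P_f + P_w, the two weighting factors give w_w P_w + w_f P_f = w_w [P + (w_f/w_w - 1) P_f], so the weight on the family mean, relative to a weight of 1 on the individual, is w_f/w_w - 1. The Python sketch below, in which the function names and test values are illustrative assumptions, confirms that this ratio agrees with the bracketed term of equation [13.8].

```python
def family_mean_weight(n, r, t):
    """Weight given to P_f in equation [13.8] when the individual's value has weight 1."""
    return (r - t) / (1 - r) * n / (1 + (n - 1) * t)


def weight_from_rescaling(n, r, t):
    """The same weight obtained by rescaling the two weighting factors for P_w and P_f."""
    w_within = (1 - r) / (1 - t)
    w_family = (1 + (n - 1) * r) / (1 + (n - 1) * t)
    return w_family / w_within - 1


# Hypothetical families of four full sibs at three values of t.
for t in (0.1, 0.5, 0.8):
    print(t, round(family_mean_weight(4, 0.5, t), 3), round(weight_from_rescaling(4, 0.5, t), 3))
```

The weight changes sign as t passes r, which is the point made in the text about a family mean that reflects environment rather than breeding value.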

Example 13.1 The operation of combined selection will be illustrated by application 
of equation [13.8] to the figures in Table 13.1, which are realistic values for litter sizes 
of mice. The phenotypic values are listed again here in table (i) with the family means 
as deviations from the overall mean. The full-sib correlation of litter size is about t = 
0.1. Substituting this, with r = 0.5, n = 4, in equation [13.8] gives the index of merit 
as $I = P + 2.46P_f$. The index so calculated for each individual is given in table (ii).
The only difference between the individuals selected by combined selection and those
selected by individual selection is in the 4th-ranking individual, A3 being chosen instead
of B2 or D1.


              Table (i)                          Table (ii)

           A     B     C     D             A       B       C       D

1         13    11     7     9           17.9    13.5     4.5     4.1
2         10     9     7     5           14.9    11.5     4.5     0.1
3          8     6     6     3           12.9     8.5     3.5    -1.9
4          5     6     4     3            9.9     8.5     1.5    -1.9

$P_f$     +2    +1    -1    -2

Overall mean        7                             7
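
The figures in table (ii) can be reproduced with a short Python sketch of equation [13.8]. The data are those of table (i); the variable names, and the reading of the example as a selection of the four highest-ranking individuals, are assumptions made for illustration.

```python
from statistics import mean

# Phenotypic values (litter sizes) of the four families, as listed in table (i).
families = {
    "A": [13, 10, 8, 5],
    "B": [11, 9, 6, 6],
    "C": [7, 7, 6, 4],
    "D": [9, 5, 3, 3],
}
n, r, t = 4, 0.5, 0.1

# Weight of the family-mean deviation in equation [13.8].
weight = (r - t) / (1 - r) * n / (1 + (n - 1) * t)

overall_mean = mean(v for values in families.values() for v in values)

index = {}
for name, values in families.items():
    p_f = mean(values) - overall_mean  # family mean as a deviation from the overall mean
    for rank, p in enumerate(values, start=1):
        index[f"{name}{rank}"] = p + weight * p_f

# Assumed selection of the four individuals with the highest index.
top_four = sorted(index, key=index.get, reverse=True)[:4]
print(round(weight, 2))
print({k: round(v, 1) for k, v in index.items()})
print(top_four)
```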



Relative merits of the methods 

The merit of any one method of selection relative to any of the others can be worked 
out from the expected responses given in Table 13.4. The formulae expressing relative 
merits, however, are very cumbersome, depending in a complicated way on the 
phenotypic correlation t, the family size n, and whether the families are full or half 
sibs, i.e., r. The relative merits are summarized graphically in Fig. 13.1. This shows 
the expected responses of the three simple methods relative to combined selection, 



Fig. 13.1. Relative merits of the different methods of selection, with full-sib families. Responses 
relative to that for combined selection plotted against the phenotypic intraclass correlation, t. 
I = individual selection; F = family selection; W = within-family selection.

which must always be the best method. The relative responses are plotted against
the phenotypic correlation t with the two extremes of family size: very large families
in (a) and families of size 2 in (b). The intensity of selection i is assumed to be
the same for all methods. The following conclusions can be drawn from the graphs
though, for reasons to be explained, they are not immediately applicable to practical
breeding operations. First, comparing the three simple methods with combined selection,
one or other of the simple methods is seldom more than 10 per cent, and never
more than 20 per cent, below combined selection in the expected response. Second,
comparing the three simple methods, individual selection is best over much of the 
range of t. The reason for this is that individual selection operates on the whole of 
the additive genetic variance, whereas family selection operates only on the variance 
between family means, and within-family selection only on the variance within 
families. Family selection is better than individual selection when the phenotypic 
correlation t is low. Low sib-correlations imply a low heritability and little 
resemblance from common environment. These are the conditions that, in general, 
make family selection better than individual selection. Within-family selection is 
better than individual selection when the sib-correlation is very high. Very high correlations
can only arise from a very large common environment component of
variance $V_{Ec}$; this is the condition that makes within-family selection better than
individual selection, but it occurs only rarely. There are, however, other considerations.
The first is inbreeding. Family selection is likely to result in fewer families
being represented among the selected parents, unless the intensity of selection is 
correspondingly reduced. Consequently the rate of inbreeding is likely to be greater 
with family than with individual selection. Indeed, any use of family means, as with 
combined selection, tends to increase the rate of inbreeding. Conversely, within- 
family selection is likely to reduce the rate of inbreeding. The second consideration 
is the reduction of the additive variance explained in Chapter 11. As shown in Table 
11.1, it is only the between-family variance that is reduced by the disequilibrium 
generated by selection. Consequently within-family selection has the advantage over 
between-family selection in that it operates on a larger amount of additive variance. 
The third consideration is economic, and this is what makes the comparisons in Fig. 
13.1 not immediately relevant to practical breeding operations. The cost of obtaining
both individual and family records may make combined selection uneconomic
or impracticable. In animal breeding, individual records are easily obtainable but
pedigree records, needed to apply combined selection, may be costly to obtain. The
choice is between the less efficient individual selection and the more costly combined
selection. In plant breeding, on the other hand, families may be easily identified
but individual plant records may be difficult to obtain, or they may not be
meaningful. In this case family selection may be the only practicable method. For 
details of family selection applied to plants, see England (1977). 

Example 13.2 To illustrate how the different methods of selection are compared, their 
relative merits will be calculated for three characters in mice. The families are assumed 
to be single litters. Weaning weight has a very large common environment component 
giving a high full-sib correlation of 0.8. Six-week weight has a smaller, but still large, 
common environment component giving a correlation of 0.6 (see Example 10.5). Litter 
size, a character of the adult female, has a low heritability and a small amount of common
environment, giving a low full-sib correlation of 0.1. The intensity of selection
is assumed to be the same for all methods of selection, though this would not be true
in practice. The family size is assumed to be n = 4. The families are full sibs, so r
= 0.5. The expected responses relative to individual selection are calculated from Table
13.4, by entering these values of n and r and the appropriate value of t. The relative
responses are given in the table. Combined selection would be nearly 20 per cent better
than individual selection for weaning weight and litter size, but would have virtually
no advantage for 6-week weight. Family selection would be 10 per cent better than individual
selection for litter size. (If males were to be selected they would have to be
selected by sib selection whatever method was applied to the females.) Within-family
selection would not be better than individual selection for any of the characters, but it
would be for weaning weight if the family size was more than 5.


                                     Weaning    6-week    Litter
                                     weight     weight    size

t                                     0.8        0.6       0.1
Weighting of $P_f$ in eq. [13.8]     -0.71      -0.29     +2.46
Combined: $R_c/R$                     1.18       1.01      1.19
Family: $R_f/R$                       0.68       0.75      1.10
Within-family: $R_w/R$                0.97       0.68      0.46
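
The entries of this table can be recomputed from the formulae of Table 13.4 and equation [13.8]. The Python sketch below does so for n = 4 and r = 0.5; the function name and the order of the returned values are assumptions for illustration, and equal intensity of selection is assumed throughout, as in the example.

```python
import math


def relative_responses(n, r, t):
    """Weight of P_f and responses relative to individual selection (equal intensity)."""
    x = 1 + (n - 1) * t
    weight = (r - t) / (1 - r) * n / x                                  # equation [13.8]
    combined = math.sqrt(1 + (r - t) ** 2 / (1 - t) * (n - 1) / x)      # R_c / R
    family = (1 + (n - 1) * r) / math.sqrt(n * x)                       # R_f / R
    within = (1 - r) * math.sqrt((n - 1) / (n * (1 - t)))               # R_w / R
    return weight, combined, family, within


for trait, t in [("weaning weight", 0.8), ("6-week weight", 0.6), ("litter size", 0.1)]:
    print(trait, [round(v, 2) for v in relative_responses(n=4, r=0.5, t=t)])
```

Calling the function with t = 0.8 and increasing n shows the within-family response overtaking individual selection once the family size exceeds 5, as stated in the example.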


Example 13.3 A comparison of sib selection with individual selection has been made 
experimentally with Drosophila (Clayton and Robertson, 1957). Sib selection was made 
with both full-sib and half-sib families. The responses were compared with individual 
selection with intensity i = 1.40 as given in Example 11.2. The table gives the data 
needed to calculate the expected responses, relative to individual selection, from the formulae
in Table 13.4. It will be seen that r for the half sibs was a little greater than 0.25.
This was because the females mated to the same male were not entirely unrelated to 
each other. The relative responses expected and observed are given in the right-hand 
part of the table. The expectation is that sib selection should be less good than individual 
selection, and so it proved to be. There was, however, some discrepancy between the 
upward and downward responses, for which the reason is not known. 



            Data                                   Relative response, $R_s/R$

       Full sibs    Half sibs                      Full sibs    Half sibs

i        1.33         1.27          Exp.             0.832        0.614
n        12           20            Obs. up          0.618        0.527
r        0.50         0.275         Obs. down        0.919        0.635
t        0.265        0.121
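
The expected relative responses in this table can be recovered from the sib-selection formula of Table 13.4. The Python sketch below assumes, as an inference from the data given rather than a statement in the text, that the expectation also includes the ratio of the intensity of sib selection to the intensity of individual selection (i = 1.40); the function name is likewise illustrative.

```python
import math


def sib_relative_response(n, r, t, i_sib, i_individual):
    """Expected response to sib selection relative to individual selection."""
    factor = n * r / math.sqrt(n * (1 + (n - 1) * t))  # from Table 13.4
    return (i_sib / i_individual) * factor


full_sibs = sib_relative_response(n=12, r=0.50, t=0.265, i_sib=1.33, i_individual=1.40)
half_sibs = sib_relative_response(n=20, r=0.275, t=0.121, i_sib=1.27, i_individual=1.40)
print(round(full_sibs, 3), round(half_sibs, 3))
```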





Index selection 

The optimal procedure for selection uses all the information available about each 
individual’s breeding value, combined into an index of merit. The solution given 
above for combining the family mean and within-family deviation is not readily applicable
to more complex situations when there may be more than two sources of information.
There may, for example, be information from the individual, its parents,
full sibs, half sibs, and other relatives. Or, if the character is limited to one sex,
the information about individuals that cannot themselves be measured will come only
from relatives of different sorts. The aim therefore is to combine all the information
into an index on the basis of which the individuals will be selected. The construction 
of an index is not easy without the use of matrix methods, particularly if there are 
more than two sources of information. The technical details are beyond the scope 
of this book and only a brief account of the principles involved can be given. For 
more detailed accounts see Nordskog (1978), Lin (1978), and Nicholas (1987). 

Construction of an index 

The index is the best linear prediction of an individual’s breeding value and it takes 
the form of a multiple regression of breeding value on all the sources of information.
Consider first the simplest situation where the only information we have is
the individual’s own phenotypic value P as a deviation from the population mean.
Then the predicted, or expected, breeding value is $E(A) = b_{AP}P$, where $b_{AP}$ is the
regression of breeding value on phenotypic value. In this case $b_{AP} = h^2$ (equation
[10.2]). Now suppose that there are several pieces of information, $P_1$, $P_2$, $P_3$, etc.,
where each P is the phenotypic value of an individual or a group of relatives. These
pieces of information in the form of phenotypic values will be referred to as
‘measurements’; all are to be expressed as deviations from the population mean.
The index of an individual is then

$$I = b_1P_1 + b_2P_2 + b_3P_3 + \ldots \qquad \ldots [13.9]$$

in which the b’s are the factors by which each measurement is to be weighted. The
problem is to find the best value for each weighting factor. This is done by finding
what values will give the maximum correlation $r_{IA}$ between the index and the
breeding value. Maximizing $r_{IA}$ is equivalent to minimizing the sum of squared
deviations of index values from the linear regression of I on A, i.e., $\sum(I - A)^2$.
The resulting values of the b’s are then the partial regression coefficients of the individual’s
breeding value on each measurement. The maximizing of $r_{IA}$ is a standard
procedure for calculating partial regressions and it need not be explained here.
The maximization leads to a set of simultaneous equations, with as many equations
as there are measurements, and the solution of these equations gives the values of
the b’s to be used in equation [13.9]. These index equations for solution are given
below (equations [13.10]). The equations for three measurements are given; it will
easily be seen how they would be extended or reduced for more or fewer
measurements. Each equation relates the phenotypic variances and covariances of
the measurements, on the left, to the additive genetic variances and covariances of
the individuals measured, on the right. The notation is condensed as follows. P means
the phenotypic variance or covariance of the measurements denoted by subscript
numbers. For example, $P_{11}$ is the phenotypic variance of measurement 1, and $P_{12}$
is the phenotypic covariance of measurements 1 and 2. The variances and covariances
of breeding values are similarly written as A. The equations are

$$\begin{aligned}
b_1P_{11} + b_2P_{12} + b_3P_{13} &= A_{11} \\
b_1P_{21} + b_2P_{22} + b_3P_{23} &= A_{21} \\
b_1P_{31} + b_2P_{32} + b_3P_{33} &= A_{31}
\end{aligned} \qquad \ldots [13.10]$$

To solve these equations the numerical values of the P’s and A’s must of course
be inserted. The P’s and A’s can all be expressed in terms of the following parameters:
the phenotypic variance $V_P$, which will be denoted here by $\sigma^2$; the heritability of
individual values, $h^2$; the phenotypic correlations between individuals, t; and the
coefficients of relationship, r, as in Table 9.3. When ‘measurements’ are the means
of groups of individuals, the number n in the group is also needed. There are standard
computer programs for solving the equations. The indices obtained will be
illustrated by reference to specific examples, simplified by considering only two
measurements. Some other examples are dealt with by Becker (1984).
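
As a minimal sketch of how index equations of the form [13.10] might be solved by computer, the following Python function applies ordinary Gaussian elimination to any number of measurements. It is not one of the standard programs referred to above. The function name and the worked case (an individual and two of its full sibs, with assumed values of h² = 0.3, t = 0.2 and the phenotypic variance scaled to 1) are hypothetical.

```python
def solve_index_equations(P, A):
    """Solve the index equations P b = A for the weights b by Gaussian elimination."""
    k = len(A)
    # Augmented matrix [P | A], copied so that the inputs are left unchanged.
    M = [[float(x) for x in P[i]] + [float(A[i])] for i in range(k)]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry of this column to the diagonal.
        pivot = max(range(col, k), key=lambda row: abs(M[row][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, k):
            factor = M[row][col] / M[col][col]
            for j in range(col, k + 1):
                M[row][j] -= factor * M[col][j]
    # Back-substitution.
    b = [0.0] * k
    for i in reversed(range(k)):
        known = sum(M[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (M[i][k] - known) / M[i][i]
    return b


# Hypothetical case: the individual (1) and two full sibs (2 and 3), variances scaled to 1.
h2, r, t = 0.3, 0.5, 0.2
P = [[1.0, t, t], [t, 1.0, t], [t, t, 1.0]]   # phenotypic variances and covariances
A = [h2, r * h2, r * h2]                      # additive covariances with the individual
b = solve_index_equations(P, A)
print([round(x, 4) for x in b])
```

Because the two sibs are interchangeable in this assumed case, they receive equal weights.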

Individual and one relative. Let measurement 1 be of the individual whose index 
is to be calculated, and measurement 2 be that of one relative. Then the phenotypic 
variances of the measurements are the same and $P_{11} = P_{22} = \sigma^2$. The phenotypic
covariance is $P_{12} = P_{21} = t\sigma^2$. The additive variance is $A_{11} = h^2\sigma^2$ and the
additive covariance is $A_{21} = rh^2\sigma^2$, where r is the coefficient of relationship
between the relative and the individual. After dividing all through by $\sigma^2$, the equations
for solution become

$$\begin{aligned}
b_1 + tb_2 &= h^2 \\
tb_1 + b_2 &= rh^2
\end{aligned}$$

and the solution, after some simplification, is

$$b_1 = \frac{h^2(1 - rt)}{1 - t^2}; \qquad b_2 = \frac{h^2(r - t)}{1 - t^2}$$

The index is obtained by substituting the values $b_1$ and $b_2$ into equation [13.9].

For the practical purpose of choosing individuals it may be convenient to rescale
the index so that the individuals’ own measurements are given a weight of 1. The
weight to be given to $P_2$ is then $b_2/b_1 = (r - t)/(1 - rt)$, and the rescaled index is

$$I = P_1 + \left[\frac{r - t}{1 - rt}\right]P_2$$


With the index in this form the actual value of Pi can be used, but P 2 must be a 
deviation from the population mean; the value of the index is then in the units of 
measurement. In its original form the index predicts the individual’s breeding value; 
in the rescaled form it is a phenotypic value adjusted by the information on the relative. 
Any index can be similarly rescaled. It must be noted, however, that if the response 
to index selection is to be predicted, allowance must be made for any rescaling of 
the index. The prediction of the response will be explained later. 
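
The two forms of the index for an individual and one relative can be sketched in Python as follows. The function names and the numerical values (a heritability of 0.4, a full sib with r = 0.5, a phenotypic correlation t = 0.2, and measurements of 12 and 9 in a population with mean 10) are assumptions made only for illustration.

```python
def relative_index_weights(h2, r, t):
    """Weights b1 (individual) and b2 (one relative) of the original index."""
    denom = 1.0 - t * t
    b1 = h2 * (1.0 - r * t) / denom
    b2 = h2 * (r - t) / denom
    return b1, b2


def rescaled_index(p1, p2, pop_mean, r, t):
    """Rescaled index: own value plus the weighted deviation of the relative."""
    weight = (r - t) / (1.0 - r * t)   # equals b2 / b1
    return p1 + weight * (p2 - pop_mean)


# Assumed values: h2 = 0.4, full sib (r = 0.5), phenotypic correlation t = 0.2.
b1, b2 = relative_index_weights(0.4, 0.5, 0.2)
print(round(b1, 4), round(b2, 4))
print(round(b2 / b1, 4))   # weight given to the relative after rescaling
print(round(rescaled_index(12.0, 9.0, 10.0, 0.5, 0.2), 4))
```

The rescaled value is in the units of measurement, so it cannot be used directly as a predicted breeding value.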


Mother and one paternal half sister. This exemplifies the selection of males for
a female character, such as milk-yield or egg production. The individual whose index
is to be calculated is not measured. To make the index equations comparable with
those of equations [13.10] we shall regard measurement 1 as being absent,
measurement 2 as that of the mother, and measurement 3 as the half sister. The relevant
parts of equations [13.10] are therefore

$$
\begin{aligned}
b_2 P_{22} + b_3 P_{23} &= A_{21} \\
b_2 P_{32} + b_3 P_{33} &= A_{31}
\end{aligned}
$$

The phenotypic variances are again equal, and $P_{22} = P_{33} = \sigma^2$. The mother and
half sister will be assumed to be unrelated and uncorrelated environmentally, so
$P_{23} = P_{32} = 0$. $A_{21}$ is the additive covariance of the individual with his mother, which
is $\tfrac{1}{2}h^2\sigma^2$. $A_{31}$ is the additive covariance of the individual with his half sib, which
is $\tfrac{1}{4}h^2\sigma^2$. The index equations then reduce directly to the solutions

$$
b_2 = \tfrac{1}{2}h^2; \qquad b_3 = \tfrac{1}{4}h^2
$$

giving the index

$$
I = \tfrac{1}{2}h^2 P_2 + \tfrac{1}{4}h^2 P_3
$$
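
A short Python illustration of this index follows. It assumes that the two measurements are entered as deviations from the population mean, and the function name and the figures used (a heritability of 0.35, a mother 400 kg and a half sister 200 kg above the mean) are hypothetical.

```python
def male_index(h2, mother_dev, half_sister_dev):
    """Index for an unmeasured male: I = (1/2) h2 P2 + (1/4) h2 P3.

    mother_dev and half_sister_dev are assumed to be deviations from the
    population mean, in the units of measurement.
    """
    return 0.5 * h2 * mother_dev + 0.25 * h2 * half_sister_dev


# Hypothetical records: mother +400 kg, paternal half sister +200 kg.
print(male_index(0.35, 400.0, 200.0))
```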


Individual and mean of sibs. This is the situation with which the problem was
introduced at the beginning of this chapter, and described earlier as ‘combined
selection’. The individual is again measurement 1, so $P_{11} = \sigma^2$ and $A_{11} = h^2\sigma^2$ as
before. The variance of a family mean was given in Table 13.3, from which
$P_{22} = [1 + (n - 1)t]\sigma^2/n$. It is usually convenient to include the individual in the
family mean. The covariance of the individual with the family mean is then equal
to the variance of family means. Thus $P_{12} = P_{21} = P_{22}$. In the same way the
additive covariance is equal to the additive variance of family means. Taking the
additive variance from Table 13.3 gives $A_{21} = [1 + (n - 1)r]h^2\sigma^2/n$. The index
equations are thus

$$
\begin{aligned}
b_1 + K b_2 &= h^2 \\
K b_1 + K b_2 &= k h^2
\end{aligned}
$$

where

$$
K = \frac{1 + (n - 1)t}{n} \qquad \text{and} \qquad k = \frac{1 + (n - 1)r}{n}
$$

Solving for $b_1$ and $b_2$ gives the index

$$
I = \frac{h^2(1 - k)}{1 - K}\, P_1 + \frac{h^2(k - K)}{K(1 - K)}\, P_2
$$

If different individuals have different numbers of sibs, k and K must be evaluated 
separately for each individual. If the number of sibs is constant, the index can be 
rescaled so as to become the same as equation [13.8]. 
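
The weights for the individual and its family mean can be evaluated as in the following sketch, which computes K and k for a given family size so that families of different sizes can be handled one at a time. The function name and the parameter values (a heritability of 0.4, full sibs with r = 0.5, t = 0.2, and families of 5) are assumptions for illustration only.

```python
def family_index_weights(h2, r, t, n):
    """Weights for the individual (b1) and its family mean (b2).

    The family mean is taken to include the individual; n is the family
    size, r the coefficient of relationship and t the phenotypic
    correlation between members of the family.
    """
    K = (1.0 + (n - 1) * t) / n   # variance of family means / sigma^2
    k = (1.0 + (n - 1) * r) / n   # additive covariance / (h2 * sigma^2)
    b1 = h2 * (1.0 - k) / (1.0 - K)
    b2 = h2 * (k - K) / (K * (1.0 - K))
    return b1, b2, K, k


b1, b2, K, k = family_index_weights(h2=0.4, r=0.5, t=0.2, n=5)
print(round(b1, 4), round(b2, 4))
# Check that the weights satisfy the two index equations.
print(round(b1 + K * b2, 4), round(K * b1 + K * b2, 4))
```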

The usefulness of the information obtainable from sibs depends on the family size 
n. Because there are usually many more half sibs than full sibs, half-sib families 
may give more information than full sibs, despite the less close genetic relationship. 
The relative merits of indices for application to poultry are examined by Osborne 
(1957). 


Other ‘measurements’ involving means. In applying the quantities denoted by K
and k above in other situations, two points need to be noted. First, a group of relatives
may be related to each other differently from the way that they are to the individual.
Care must then be taken to use the appropriate values of t and r. For example, $P_2$
might be the mean of a group of full sibs that are half sibs to the individual. Then
the t appropriate to $P_{22}$ is the correlation of full sibs, but the t appropriate to $P_{12}$
and $P_{21}$, and the r appropriate to $A_{21}$, are the correlations of half sibs. Second, the
‘measurement’ of an individual may be the mean of several repeated records. Then
the repeatability must be used in place of t.


Accuracy 

The correlation $r_{IA}$ between index values and breeding values, which is maximized
in the construction of an index, is known as the accuracy of the index. It provides
a convenient way of comparing indices, or any criteria of selection, because the
higher the correlation the better is the criterion as a predictor of breeding values.
With individual selection the criterion is simply the individual’s phenotypic value.
The accuracy of individual selection is therefore the correlation of phenotypic values
with breeding values, which is h, the square root of the heritability (equation [10.3]).
The accuracy $r_{IA}$ of an index is calculated as follows.

First, we need to know the variance of index values. From equation [13.9] this
can be seen to be

$$
\sigma_I^2 = b_1^2 P_{11} + b_2^2 P_{22} + \cdots + 2 b_1 b_2 P_{12} + \cdots \qquad [13.11]
$$

where variances and covariances are written in the notation of equations [13.10].
This expression can be put into a form that is easier to calculate. Rearranging the
terms gives

$$
\sigma_I^2 = b_1(b_1 P_{11} + b_2 P_{12} + \cdots) + b_2(b_1 P_{21} + b_2 P_{22} + \cdots) + \cdots
$$

Substituting for the terms in brackets from equations [13.10] leads to

$$
\sigma_I^2 = b_1 A_{11} + b_2 A_{21} + b_3 A_{31} + \cdots \qquad [13.12]
$$

Next we need to know the covariance of index values with breeding values. This
is $\mathrm{cov}_{AI} = \sigma_I^2$ for the following reason. The construction of the index so that the
weighting factors, the b’s, are partial regression coefficients results in the regression
of breeding values on index values being unity; i.e., $b_{AI} = 1$. Another way
of saying this is that, in its construction, the index is scaled so that 1 unit of the
index is equivalent to 1 unit of predicted breeding value. Now, $\mathrm{cov}_{AI}/\sigma_I^2 = b_{AI} = 1$,
from which it follows that $\mathrm{cov}_{AI} = \sigma_I^2$. Thus, provided the index has not
been rescaled, the correlation is given by

$$
r_{IA} = \frac{\mathrm{cov}_{IA}}{\sigma_I \sigma_A} = \frac{\sigma_I}{\sigma_A} \qquad [13.13]
$$

Here $\sigma_I$ is obtained from equation [13.12] and $\sigma_A$ is the square-root of the additive
genetic variance in the population.

The correlation $r_{IA}$ is a multiple correlation, and its square expresses the fraction
of the additive variance that is accounted for by the measurements combined in the
index, i.e., $r_{IA}^2 = \sigma_I^2/\sigma_A^2$. Conversely, $(1 - r_{IA}^2)$ is the fraction of the additive
variance that is not taken account of in the index. This shows how much room there
is for improvement of the index by inclusion of additional measurements.
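
The calculation of accuracy from equations [13.12] and [13.13] can be sketched as follows. The function name is hypothetical, and the numbers are assumed for illustration: an index on the individual and one full sib with a heritability of 0.4 and t = 0.2, for which the weights are 0.375 and 0.125 when all quantities are expressed in units of the phenotypic variance.

```python
import math


def index_accuracy(b, A, additive_variance):
    """Accuracy r_IA of an index that has not been rescaled.

    b: index weights; A: additive covariances A11, A21, ... of the
    measurements with the breeding value; additive_variance: sigma_A^2.
    """
    var_index = sum(bi * Ai for bi, Ai in zip(b, A))   # equation [13.12]
    return math.sqrt(var_index / additive_variance)    # equation [13.13]


# Units of sigma^2: A11 = h2 = 0.4, A21 = r * h2 = 0.2, sigma_A^2 = 0.4.
r_IA = index_accuracy([0.375, 0.125], [0.4, 0.2], 0.4)
print(round(r_IA, 4))               # accuracy of the index
print(round(math.sqrt(0.4), 4))     # accuracy of individual selection, h
print(round(1.0 - r_IA ** 2, 4))    # additive variance not accounted for
```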


Response to selection 

The response to selection is the mean breeding value of the selected parents, which
is predicted from the regression of breeding values on index values as $R = b_{AI} S$,
where S is the selection differential of index values. Putting $b_{AI} = r_{IA}\sigma_A/\sigma_I$ and
$S = i\sigma_I$ gives the predicted response as

$$
R = i\, r_{IA}\, \sigma_A \qquad [13.14]
$$

This prediction applies to any index, however constructed and whether optimal or
not, provided $r_{IA}$ refers to the index actually used. If the index has not been
rescaled, substitution for $r_{IA}$ can be made from equation [13.13] to give

$$
R = i\sigma_I \qquad [13.15]
$$

Comparison with individual selection. In a practical breeding operation the first
question about index selection is whether it is worth the cost of recording pedigrees,
without which no information about families is available. The alternative is individual
selection, whose accuracy is h. The accuracy of index selection relative to individual
selection is therefore $r_{IA}/h$. This is also the ratio of the two expected responses in
equations [13.14] and [11.4] if the intensities of selection are the same.
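
A small numerical sketch of equations [13.14] and [13.15], and of the comparison with individual selection, is given below. The function names and all the numbers (an intensity of selection of 1.4, a heritability of 0.4, a phenotypic standard deviation of 10 units, and an index with $\sigma_I^2 = 0.175\sigma^2$) are assumptions chosen for illustration.

```python
import math


def predicted_response(i, r_IA, sigma_A):
    """Equation [13.14]: R = i * r_IA * sigma_A."""
    return i * r_IA * sigma_A


def relative_accuracy(r_IA, h2):
    """Accuracy of index selection relative to individual selection, r_IA / h."""
    return r_IA / math.sqrt(h2)


h2, sigma_P, i = 0.4, 10.0, 1.4        # assumed parameters
sigma_A = math.sqrt(h2) * sigma_P      # additive standard deviation
sigma_I = math.sqrt(0.175) * sigma_P   # from equation [13.12], assumed index
r_IA = sigma_I / sigma_A               # equation [13.13]

print(round(predicted_response(i, r_IA, sigma_A), 3))   # equation [13.14]
print(round(i * sigma_I, 3))                            # equation [13.15]
print(round(relative_accuracy(r_IA, h2), 3))
```

The first two printed values agree because the index in this example has not been rescaled.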

Practice. When the theory of indices is applied in practical breeding it is obviously
desirable to make use of all the information on all the individuals for which there
are records of pedigree and phenotypic value. The index will have to incorporate
information from relatives of several different sorts, with appropriate weighting for
family sizes, and there will usually also be fixed effects such as age, herds, or years,
to be estimated and incorporated in the analysis. These and other complexities,
however, are beyond the scope of this book. The procedure used is known as BLUP
(Best Linear Unbiased Prediction). It provides estimates of the mean, the fixed effects,
and the predicted breeding values of all the individuals available for selection, but
prior knowledge of the heritability is required. For accounts of the procedure see
Henderson (1977, 1988 and references therein). A simple explanation is given by
Nicholas (1987, p. 471). Reviews of the associated methodology are given by
Thompson (1979) and Kennedy (1981). If the heritability is not known in advance the
procedure known as REML, which was mentioned in Chapter 10, is the preferred method
for estimating the population parameters and the predicted breeding values.

Indices for predicting breeding values can be improved by the incorporation of 
data on other characters correlated with the one for which prediction is required, 
and which may themselves be objects of selection. This aspect of indices is dealt 
with in Chapter 19. 

Actual achievements 

Before leaving the subject of selection we should consider how much improvement
of farm animals has actually been achieved by the application of selection in the
recent past. This question has been reviewed by Smith (1984, 1988). Genetic
improvement has been assessed in poultry for weight gain in broilers and egg number in
layers, in pigs and sheep for aspects of body weight and for litter size, in cattle for
weight and milk-yield. All of these show improvement. Expressed as percentages
of the means, the rates are mostly in the range from 1 to 2 per cent per year (not
per generation). Broiler poultry, however, showed 6 per cent improvement and beef
cattle only 0.3 per cent. Except for the broilers these rates were somewhat below
what should have been possible. The possible rates were calculated from the known
heritabilities, the accuracies of the selection methods, and the intensities of
selection. The possible rates of improvement ranged between 1.4 and 3.2 per cent per
year. The conclusion is that substantial progress is undoubtedly being made, but
it should be possible to increase the rates by improved procedures.

Problems 

13.1 Calculate the heritability of family means, $h_f^2$, and of within-family
deviations, $h_w^2$, for characters with the following parameters. Calculate also the
expected responses to family selection and to within-family selection, relative in each
case to the response to individual selection, i.e. $R_f/R$ and $R_w/R$, assuming equal
intensities of selection.

       Individual      Type of      Sib            Family
       heritability    family       correlation    size

(1)    0.1             Half sibs    0.025          10
(2)    0.1             Half sibs    0.025          20
(3)    0.1             Full sibs    0.5            4
(4)    0.2             Full sibs    0.8            4
(5)    0.2             Full sibs    0.8            8


[Solution 77] 

13.2 Daily weight gain in British Large White pigs has a half-sib correlation of 
0.10 and a full-sib correlation of 0.36. Compare the expected rates of progress, 
relative to that of individual selection, when selection is based on 

(1) The mean of 5 half sibs, all from different dams, the selected individual being 
one of the five. 

(2) The mean of 5 full sibs, which include the selected individual. 

(3) The mean of 5 full sibs, which exclude the selected individual. 

(4) The individual’s deviation from the mean of its 5 full sibs, which include itself. 

Data from Smith, C. et al. (1962) Anim. Prod., 4, 128-43. 


[Solution 87] 

13.3 What would be the appropriate index for selecting pigs for daily weight gain 
on the basis of the individual’s gain and the family mean of 5 full sibs, the individual 
being included in the family mean? Take the full-sib correlation to be 0.36 as in 
Problem 13.2. 

[Solution 97] 

13.4 If the figures below were the daily weight gains of four individual pigs from 
different full-sib families, what would be their order of merit according to the index 
worked out in Problem 13.3 with any necessary modification for the numbers of sibs?

         Weight gain
Pig      Individual    Family mean    Number in family

A        1.6           1.3            5
B        1.5           1.6            5
C        1.5           1.6            8
D        1.3           1.7            8

Population mean = 1.5


[Solution 107] 

13.5 If selection for daily weight gain in pigs were applied by the index calculated 
in Problem 13.3, how would the expected response compare with the response 
expected from individual selection? 

[Solution 117] 

13.6 Construct an index for selecting bulls for milk yield on the basis of the yields 
of the mother and 10 paternal half sisters. Assume that the half sisters all have
different mothers which are not related to the bull’s mother. Assume also that there
is no environmental correlation between half sibs. Take the heritability of milk yield 
to be 0.35 (Table 10.1), though this is higher than most estimates. 

[Solution 127] 

13.7 Predict the rate of improvement of milk yield per generation if 1 in 20 bulls 
were selected by the index of Problem 13.6, and 1 in 2 cows were selected on their 
own yield. Take the phenotypic standard deviation of milk yield to be 696 kg 
(Example 8.7), and assume that the selection is made from a large number measured. 

[Solution 137] 



INBREEDING AND CROSSBREEDING: 
I. Changes of mean value 


We turn our attention now to inbreeding, the second of the two ways open to the 
breeder for changing the genetic constitution of a population. The harmful effects 
of inbreeding on reproductive rate and general vigour are well known to breeders 
and biologists, and were mentioned in Chapter 6 as one of the two basic genetic 
phenomena displayed by metric characters. The opposite, or complementary, 
phenomenon of hybrid vigour resulting from crosses between inbred lines or
between different races or varieties is equally well known, and forms an important
means of animal and plant improvement. The production of lines for subsequent 
crossing in the utilization of hybrid vigour is one of two main purposes for which 
inbreeding may be carried out. The other is the production of genetically uniform 
strains, particularly of laboratory animals, for use in bioassay and in research in 
a variety of fields. Inbreeding in itself, however, is almost universally harmful and 
the breeder or experimenter normally seeks to avoid it as far as possible, unless 
for some specific purpose. 

In the treatment of inbreeding given in Chapter 3, the consequences were
described in terms of the changes of gene frequencies and of genotype frequencies.
Here we have to show how these changes of gene and genotypic frequencies affect 
metric characters, and how they can account for the observed effects of inbreeding 
and crossing. The effects on the mean value will be explained in this chapter and 
the effects on the variance in the next. In Chapter 16 we shall consider the use of 
inbreeding and crossing as a means of plant and animal improvement. 

The effects of inbreeding to be described do not apply to naturally self-fertilizing 
plants. Since inbreeding is their normal mating system they cannot be further
inbred. They can, however, be crossed and they do then often show hybrid vigour,
though less than when inbred lines of outbreeding species are crossed. The
improvement of self-fertilizing plants, for which crossing is the first step, will be described
briefly in Chapter 16. 

Inbreeding depression 

The most striking observed consequence of inbreeding is the reduction of the mean 
phenotypic value shown by characters connected with reproductive capacity or 
physiological efficiency, the phenomenon known as inbreeding depression. Some 
examples of inbreeding depression are given in Table 14.1, from which one can 
see what sort of characters are subject to inbreeding depression, and, very roughly,
the magnitude of the effect. From the results of these and many other studies
we can make the generalization that inbreeding tends to reduce fitness. Thus,
characters that form an important component of fitness, such as litter size or
lactation in mammals, show a reduction on inbreeding; whereas characters that are not
closely connected with fitness show little or no change. In Drosophila, for example,
bristle number and body weight do not change (Kidwell and Kidwell, 1966) but
fertility and viability do (Tantawy and Reeve, 1956).

Table 14.1 Some examples of inbreeding depression. Approximate decrease of mean
per 10 per cent increase of inbreeding coefficient: (1) in absolute units; (2) as
percentage of non-inbred mean; and (3) as percentage of the original phenotypic standard
deviation. The depression given is due only to inbreeding in the individuals on which the
characters are measured, except where noted below.

                                                          (1)      (2)       (3)
                                                          Units    % of M    % of $\sigma_P$

Man
  Height (cm) at age 10 [Schull, 1962]                    2.0      1.6       37
  IQ score (percentage points) [Morton, 1978]             4.4      4.4       29
Cattle
  Milk-yield (kg) [Robertson, 1954]                       135      3.2       17
Sheep [Morley, 1954]
  Fleece weight (kg)                                      0.29     5.5       51
  Body weight at 1 yr (kg)                                1.32     3.7       36
Pigs [Bereskin et al., 1968]
  Litter size (no. born alive)                    (a)     0.24     3.1       9
  Body weight at 154 days (kg)                            2.6      4.3       15
Mice
  Litter size [Bowman and Falconer, 1960]         (b)     0.56     7.2       23
  Body weight at 6 wks (g) [White, 1972]                  0.19     0.6       7
Maize [Cornelius and Dudley, 1974]                (c)
  Plant height (cm)                               (FS)    5.20     2.1       4
                                                  (S)     5.65     2.3       5
  Yield of seed (g/plant)                         (FS)    7.92     5.6       25
                                                  (S)     9.65     6.8       30

(a) Inbreeding in the mothers; litters non-inbred.

(b) Depression related to inbreeding in the mothers under consecutive full-sib mating; litters one
generation more inbred than mothers.

(c) Depression related to inbreeding in the plants measured; inbreeding by consecutive full-sib
mating (FS) or selfing (S). Dr. J. W. Dudley kindly provided the values of $\sigma_P$ for maize.

In saying that a certain character shows inbreeding depression, we refer to the
average change of mean value in a number of lines. The separate lines are
commonly found to differ to a greater or lesser extent in the change they show, as,
indeed, we should expect in consequence of random drift of gene frequencies. This
matter of differentiation of lines will be discussed later when we deal with changes
of variance. It is mentioned here only to emphasize the fact that the changes of mean
value now to be discussed refer to changes of the mean value of a number of lines
derived from one base population. As in our earlier account of inbreeding we have
to picture the ‘whole population’ consisting of many lines. The population mean
then refers to the whole population, and inbreeding depression refers to a reduction
of this population mean. Let us now consider the theoretical basis of the change
of population mean on inbreeding.

First, we may recall and extend some of the conclusions from Chapter 3,
supposing at first that selection does not in any way interfere with the dispersion of gene
frequencies. Since the gene frequencies in the population as a whole do not change 
on inbreeding, any change of the population mean must be attributed to the changes 
of genotype frequencies. Inbreeding causes an increase in the frequencies of 
homozygous genotypes and a decrease of heterozygous genotypes. Therefore a change 
of population mean on inbreeding must be connected with a difference of genotypic 
value between homozygotes and heterozygotes. Let us now see more precisely how 
the population mean depends on the degree of inbreeding. 

Consider a population, subdivided into a number of lines, with a coefficient of
inbreeding F. The expression for the population mean is derived by putting together
the reasoning set out in Tables 3.1 and 7.1, in the following way. Table 14.2 shows
the three genotypes of a two-allele locus with their genotypic frequencies in the whole
population. These frequencies come from Table 3.1, p and q being the gene
frequencies in the whole population. Then the third column gives the genotypic values
assigned as in Fig. 7.1. The value and frequency of each genotype are multiplied 
together in the right-hand column, the summation of which gives the contribution 
of this locus to the population mean. Thus, referring still to the effects of a single 
locus, we find that a population with inbreeding coefficient F has a mean genotypic 
value: 

$$
\begin{aligned}
M_F &= a(p - q) + 2dpq(1 - F) \qquad [14.1] \\
    &= M_0 - 2dpqF \qquad [14.2]
\end{aligned}
$$

where $M_0$ is the population mean before inbreeding, from equation [7.2]. The
change of mean resulting from inbreeding is therefore $-2dpqF$. This shows that
a locus will contribute to a change of mean value on inbreeding only if d is not zero;
in other words if the value of the heterozygote differs from the average value of
the homozygotes. This conclusion, though demonstrated in detail only for two alleles
at a locus, is equally valid for loci with more than two alleles. The following general
conclusions can therefore be drawn: that a change of mean value on inbreeding is
a consequence of dominance at the loci concerned with the character, and that the
direction of the change is toward the value of the more recessive alleles. The
dominance may be partial or complete, or it may be overdominance; all that is
necessary for a locus to contribute to a change of mean is that the heterozygote should
not be exactly intermediate between the two homozygotes. Equation [14.2] shows
also that the magnitude of the change of mean depends on the gene frequencies.
It is greatest when pq is maximal: that is, when $p = q = \tfrac{1}{2}$. Genes at intermediate
frequencies therefore contribute more to a change of mean than genes at high or
low frequencies, other things being equal.

Table 14.2

Genotype     Frequency          Value     Frequency × Value

$A_1A_1$     $p^2 + pqF$        $+a$      $p^2a + pqaF$
$A_1A_2$     $2pq - 2pqF$       $d$       $2pqd - 2pqdF$
$A_2A_2$     $q^2 + pqF$        $-a$      $-q^2a - pqaF$

Sum = $a(p - q) + 2dpq - 2dpqF$
    = $a(p - q) + 2dpq(1 - F)$
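
The reasoning of Table 14.2 can be checked numerically with a short Python sketch, which sums frequency × value over the three genotypes and compares the result with equations [14.1] and [14.2]. The function name and the values used (a = 1, d = 0.5, p = 0.6, F = 0.25) are assumptions for illustration.

```python
def mean_under_inbreeding(a, d, p, F):
    """Contribution of one two-allele locus to the mean, as in Table 14.2."""
    q = 1.0 - p
    freqs = {
        "A1A1": p * p + p * q * F,
        "A1A2": 2.0 * p * q * (1.0 - F),
        "A2A2": q * q + p * q * F,
    }
    values = {"A1A1": a, "A1A2": d, "A2A2": -a}
    # Sum of frequency x value over the three genotypes.
    return sum(freqs[g] * values[g] for g in freqs)


a, d, p, F = 1.0, 0.5, 0.6, 0.25        # assumed values
q = 1.0 - p
M_0 = mean_under_inbreeding(a, d, p, 0.0)
M_F = mean_under_inbreeding(a, d, p, F)
print(round(M_0, 4), round(M_F, 4))
print(round(M_F - M_0, 4), round(-2.0 * d * p * q * F, 4))   # change of mean
```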

Now consider the combined effect of all the loci that affect the character. In so
far as the genotypic values of the loci combine additively, the population mean is
given by summation of the contributions of the separate loci, thus:

$$
\begin{aligned}
M_F &= \sum a(p - q) + 2\left(\sum dpq\right)(1 - F) \qquad [14.3] \\
    &= M_0 - 2F\sum dpq \qquad [14.4]
\end{aligned}
$$

and the change of mean on inbreeding is $-2F\sum dpq$.

These expressions show what are the circumstances under which a metric character 
will show a change of mean value on inbreeding. The chief one is directional 
dominance, which means dominance of the genes concerned being preponderantly 
in one direction. If the genes that increase the value of the character are dominant 
over their alleles that reduce the value, then inbreeding will result in a reduction 
of the population mean, i.e., a change in the direction of the more recessive alleles. 
The contribution of each locus, however, depends also on its gene frequencies, those 
with intermediate frequencies having the greatest effect on the change of mean value. 
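
The role of directional dominance can be illustrated with a sketch that sums the contributions of several loci. The function name and the two hypothetical sets of loci, each given as (a, d, p), are assumptions: in the first set dominance is in the same direction at both loci, and in the second the dominance deviations are equal and opposite.

```python
def inbreeding_change(loci, F):
    """Change of mean, -2 F * sum(d p q), for loci that combine additively.

    loci is a list of (a, d, p) tuples; a does not affect the change.
    """
    return -2.0 * F * sum(d * p * (1.0 - p) for (_a, d, p) in loci)


directional = [(1.0, 0.5, 0.5), (1.0, 0.5, 0.3)]    # dominance in one direction
balanced = [(1.0, 0.5, 0.5), (1.0, -0.5, 0.5)]      # dominance cancels out

for F in (0.25, 0.5):
    print(F, round(inbreeding_change(directional, F), 4),
          round(inbreeding_change(balanced, F), 4))
```

With additive combination of loci the change for the first set doubles when F doubles, while the second set shows no change at any level of inbreeding.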

Another conclusion that can be drawn from equation [14.4] is that when loci
combine additively the change of mean on inbreeding should be directly proportional
to the coefficient of inbreeding. In other words the change of mean should be a straight
line when plotted against F. If there is epistatic interaction between loci, the relation
between the mean and the inbreeding coefficient is not linear. The non-linearity is
due to the interaction deviation of double, or multiple, heterozygotes. The frequency
of double heterozygotes declines in proportion to $F^2$. Therefore as F increases, the
rate of depression of the mean increases if the interaction deviations are on average
positive, i.e., favourable, and the rate decreases if they are negative. No other form
of interaction affects the linearity, and epistasis without dominance cannot itself cause
any inbreeding depression. For the details of how epistasis affects inbreeding
depression, see Crow and Kimura (1970, p. 79).

Examples of experimentally observed inbreeding depression are illustrated in Figs 
14.1 and 14.2. On the whole, the observed inbreeding depression, as in these 
examples, does tend to be linear with respect to F, and this might be taken as evidence 
that epistatic interaction between loci is not of great importance. There are, however, 
several practical difficulties that stand in the way of drawing firm conclusions from 
observations of the rate of inbreeding depression. One is that, as inbreeding pro¬ 
ceeds and reproductive capacity deteriorates, it soon becomes impossible to avoid 
the loss of some individuals and of some entire lines. The survivors are then a selected 
group to which the theoretical expectations no longer apply. Thus precise measure¬ 
ment of the rate of inbreeding depression can generally be made only over the early 
stages, before the inbreeding coefficient reaches high levels. Another difficulty, met 
with particularly in the study of mammals, arises from maternal effects. Maternal 
qualities are among the most sensitive characters to inbreeding depression. The effect 
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Fig. 14.1. Effects of inbreeding on plant-height and yield of seed in maize ( Zea mays). The dotted 
and dashed lines refer to consecutive selfing; the continuous lines refer to consecutive full-sib 
mating. No selection was practised. Data from Hallauer and Sears (1973) (dotted lines), and 
Cornelius and Dudley (1974) (continuous and dashed lines). 



Fig. 14.2. Examples of inbreeding depression affecting fertility, (a) Litter-size in mice (Data from 
Bowman and Falconer, 1960). Mean number bom alive in 1st litters, plotted against the 
coefficient of inbreeding of the litters. The first generation was by double-first-cousin mating; 
thereafter by full-sib mating. No selection was practised, (b) Fertility in Drosophila subobscura. 
Mean number of adult progeny per pair per day, plotted against the inbreeding coefficient of the 
parents. Consecutive full-sib matings. (Based on Hollingsworth and Maynard Smith, 1955.) 
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of inbreeding on another character that is influenced by maternal effects is therefore 
twofold: part being attributable to the inbreeding of the individuals measured and 
part to the inbreeding in the mothers. When continuous inbreeding is practised, the 
mothers and the offspring have different coefficients of inbreeding in the early genera¬ 
tions, so the relationship between the character measured and the coefficient of in- 
breeding cannot be expressed in any simple way. In consequence of these difficulties, 
reliable conclusions cannot easily be drawn from the exact form of the depression 
observed under continuous inbreeding. 

• 

The effect of selection 

The neglect of selection during inbreeding is an unrealistic omission because natural 
selection cannot be wholly avoided even in laboratory experiments, and because 
deliberate inbreeding is usually accompanied by some artificial selection for characters 
subject to inbreeding depression. There are two effects of selection which must be 
distinguished: delay in the approach to fixation, and reduction in the inbreeding 
depression. Delay in the approach to fixation was discussed at the end of Chapter 
5. It results in the actual proportion of homozygotes being less than would be predicted 
by the inbreeding coefficient calculated from the pedigree or the population size. 
Reduction of the inbreeding depression can occur without a delay in the approach 
to homozygosis if selection leads to the better allele being fixed at more of the loci 
than would occur by chance. To make effective use of selection in this way it is 
essential for a large number of lines to be inbred in parallel. Selection is then applied 
between the lines, so that the worst lines are eliminated and the best retained. The 
reason for this, which is explained in the next chapter, is that the genetic variation 
is progressively reduced within lines and increased between lines. 

How effective can selection be in counteracting inbreeding depression? This is
an important practical question because, if highly inbred lines are to be used for
any practical purpose, the individuals must be reasonably viable and fertile.
Selection has undoubtedly been successful in reducing, or even completely overcoming,
the inbreeding depression of litter size in mice. If the graph in Fig. 14.2(a) is
extrapolated to complete inbreeding at F = 100 per cent, the predicted litter size of highly
inbred mice would be about 2. This is what would be expected of inbreeding without
selection. In fact, most of the existing inbred strains have mean litter sizes well above
2, showing that selection in their development was successful in reducing the
inbreeding depression. (Twenty strains listed by Green (1968) have a mean litter size
of 5.7 with a range from 4.1 to 7.2.) Details of the making of one highly inbred
strain with a mean better than the original population are given in the following
example.

Example 14.1 The experiment on litter size illustrated in Fig. 14.2(a) was started with
20 lines. All females born in first litters were subsequently mated to provide the mean
litter size. Only first litters were reared, so lines became eliminated when they failed
to produce one offspring of each sex in their first litters. No other selection between
lines was applied. Selection within lines was applied in 10 of the lines (these are not
included in Fig. 14.2). This selection had virtually no effect on the inbreeding
depression. At generation 3 (F = 55%), 1 of the 20 lines was lost, and at generation 6
(F = 76%), only 3 lines remained. Over the next 5 generations, these 3 had a mean litter
size of 6.6, an improvement of over 2 mice per litter attributable to selecting the best
3 out of 20 lines. One of the lines survived indefinitely and became a ‘standard’ inbred
strain known as JU. Over the three years after it had had more than 20 generations of
inbreeding, its mean litter size was 9.0 in first litters (McCarthy, 1965), which was
better than the original population before inbreeding. The experiment is described by Bowman
and Falconer (1960) and Falconer (1960a). Similar results have been obtained in two
other experiments with mice (Falconer, 1971; Eklund and Bradford, 1977).

The three experiments cited in Example 14.1 prove that by selection it is possible 
to get highly inbred lines of mice that are at least as good as the original population 
in respect of the character selected. This provides strong, though not conclusive, 
evidence that overdominant loci are not an important cause of inbreeding depres¬ 
sion of litter size in mice. If an overdominant locus is at its equilibrium gene fre¬ 
quency in the population before inbreeding, it is not possible to have a homozygous 
line that is as good as the non-inbred population. But if the frequency of the better 
homozygote is below its equilibrium value in the non-inbred population, fixing the 
better homozygote in an inbred line may increase the mean (Minvielle, 1979). The 
ways in which selection favouring heterozygotes affects the inbreeding depression 
are not straightforward. They depend on whether the gene frequencies start at their 
equilibrium values or not and, if they do, on whether the equilibrium frequency is 
intermediate or extreme, i.e., on whether the two homozygotes are nearly equal 
or very different in fitness (Hill and Robertson, 1968). When the initial frequency 
is the equilibrium value, selection reduces the inbreeding depression by delaying 
the approach to homozygosis if the two homozygotes are nearly equal in fitness; 
but it reduces the rate and the total amount of inbreeding depression if the two 
homozygotes are very different in fitness, because it then causes the better homozygote
to be fixed preferentially. 


Heterosis 

Complementary to the phenomenon of inbreeding depression is its opposite, ‘hybrid 
vigour’ or heterosis. When inbred lines are crossed, the progeny show an increase 
of those characters that previously suffered a reduction from inbreeding. Or, in 
general terms, the fitness lost on inbreeding tends to be restored on crossing. The 
amount of heterosis is the difference between the crossbred and inbred means. That 
the phenomenon of heterosis is simply inbreeding depression in reverse can be seen 
by consideration of how the population mean depends on the coefficient of inbreeding, 
as shown in equation [14.4]. Consider, as before, a population subdivided into a 
number of lines inbred without selection so that the mean gene frequencies are not 
changed. If the lines are crossed at random, the average inbreeding coefficient in 
the crossbred progeny reverts to that of the base population and, if the gene frequencies
have not changed, the frequencies of the genotypes are the same as in the
base population. Thus if a number of crosses are made at random between the lines,
the mean value of any character in the crossbred progeny is expected to be the same
as the population mean of the base population. In other words, the heterosis on crossing
is expected to be equal to the depression on inbreeding. Furthermore, if the
population is continued after the crossing by random mating among the crossbred
and subsequent generations, the coefficient of inbreeding will remain unchanged,
and the population mean is consequently expected to remain at the level of the base
population. We may thus make the following generalization on theoretical grounds:
that, in the absence of selection, inbreeding followed by crossing of the lines in a 
large population is not expected to make any permanent change in the population 
mean. 

Example 14.2 An experiment with mice (Roberts, 1960) was designed to test the 
theoretical expectation that, in the absence of selection, the heterosis on crossing should 
be equal to the depression on inbreeding. The character studied was litter size. Thirty 
lines taken from a random-bred population were inbred by 3 consecutive generations 
of full-sib mating, bringing the coefficient of inbreeding up to 50 per cent in the litters 
and 37.5 per cent in the mothers. No selection was practised during the inbreeding, and 
only 2 of the 30 lines were lost as a consequence of their inbreeding depression. 


                                   F in        F in        Mean
                                   litters     mothers     litter size

Before inbreeding                  0           0           8.1
Inbred                             0.50        0.375       5.7
Crossbred litters                  0           0.50        6.2
Crossbred litters and mothers      0           0           8.5


After the third generation of inbreeding, crosses were made at random between the
lines, and in the next generation crosses between the F1’s were made so as to give
crossbred mothers with non-inbred young. The mean litter sizes observed at the different
stages are given in the table. The inbreeding depression was 2.4 and the heterosis 2.8;
the two are equal within the limits of experimental error.
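
As a quick arithmetic check of Example 14.2, the following Python sketch recomputes the depression and the heterosis from the tabulated means. The dictionary keys are illustrative labels chosen here, not names used in the original experiment.

```python
# Mean litter sizes from the table in Example 14.2 (Roberts, 1960).
# The key names are illustrative labels only.
litter_size = {
    "before_inbreeding": 8.1,
    "inbred": 5.7,
    "crossbred_litters": 6.2,
    "crossbred_litters_and_mothers": 8.5,
}

# Depression: fall from the base population to the inbred lines.
depression = litter_size["before_inbreeding"] - litter_size["inbred"]

# Heterosis: rise from the inbred lines to fully crossbred litters and mothers.
heterosis = litter_size["crossbred_litters_and_mothers"] - litter_size["inbred"]

print(f"inbreeding depression = {depression:.1f}")
print(f"heterosis             = {heterosis:.1f}")
```

The two figures differ by 0.4 of a mouse per litter, which the text treats as within experimental error.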


Single crosses 

The foregoing theoretical conclusions refer to the average of a large number of crosses 
between lines derived from a single base population. In practice, however, one is 
often interested in a somewhat different problem, namely the heterosis shown by 
a particular cross between two lines, or between two populations which may have 
no known common origin. To refer the changes of mean value to changes of inbreeding
coefficient would be inappropriate under these circumstances, and the
theoretical basis of the heterosis is better expressed in terms of the gene frequencies
in the two lines. We may recall from Chapter 3 that inbreeding leads to a dispersion
of gene frequencies among the lines, the lines becoming differentiated in gene frequency
as inbreeding proceeds; and the coefficient of inbreeding is a means of expressing
the degree of differentiation (equation [3.14]). In turning from the inbreeding
coefficient to the gene frequencies as a basis for discussion we are therefore turning 
from the general, or average, consequence of crossing, to the particular circumstances 
in two lines. 

Let us, then, consider two populations, referred to as the parent populations,
both random-bred though not necessarily large. The parent populations are crossed
to produce an F1 or ‘first crossbred generation’, and the F1 individuals are mated
together at random to produce an F2 or ‘second crossbred generation’. The amount
of heterosis shown by the F1 or the F2 will be measured as the deviation from the
mid-parent value, i.e., as the difference from the mean of the two parent populations.
First consider the effects of a single locus with two alleles whose frequencies
are p and q in one population, and p' and q' in the other. Let the difference of
gene frequency between the two populations be y, so that y = p - p' = q' - q.
The algebra is then simplified by writing the gene frequencies p' and q' in the
second population as (p - y) and (q + y). Let the genotypic values be a, d, -a,
as before. They are assumed to be the same in the two populations, epistatic interaction
being disregarded. We have to find the mean of each parent population and
the mid-parent value; then the mean of the F1, and the mean of the F2. The parental
means, $M_{P_1}$ and $M_{P_2}$, are found from equation [7.2]. They are

$$M_{P_1} = a(p - q) + 2dpq$$

$$\begin{aligned}
M_{P_2} &= a(p - y - q - y) + 2d(p - y)(q + y) \\
        &= a(p - q - 2y) + 2d[pq + y(p - q) - y^2]
\end{aligned}$$

The mid-parent value is

$$\begin{aligned}
M_{\bar{P}} &= \tfrac{1}{2}(M_{P_1} + M_{P_2}) \\
            &= a(p - q - y) + d[2pq + y(p - q) - y^2] \qquad [14.5]
\end{aligned}$$

When the two populations are crossed to produce the F1, individuals taken at random
from one population are mated to individuals taken at random from the other
population. This is equivalent to taking genes at random from the two populations,
as shown in Table 14.3. The F1 is therefore constituted as follows:

Genotypes           A1A1          A1A2                A2A2
Frequencies         p(p - y)      2pq + y(p - q)      q(q + y)
Genotypic values    a             d                   -a


The mean genotypic value of the F1 is therefore:

$$\begin{aligned}
M_{F_1} &= a(p^2 - py - q^2 - qy) + d[2pq + y(p - q)] \\
        &= a(p - q - y) + d[2pq + y(p - q)] \qquad [14.6]
\end{aligned}$$

The amount of heterosis, expressed as the difference between the F1 and the mid-parent
values, is obtained by subtracting equation [14.5] from equation [14.6]:

$$\begin{aligned}
H_{F_1} &= M_{F_1} - M_{\bar{P}} \\
        &= dy^2 \qquad [14.7]
\end{aligned}$$

Table 14.3 Frequencies of zygotes in the F1.

                                Gametes from P1
                                A1 (p)          A2 (q)
Gametes      A1 (p - y)         p(p - y)        q(p - y)
from P2      A2 (q + y)         p(q + y)        q(q + y)

Thus heterosis, just like inbreeding depression, depends for its occurrence on
dominance. Loci without dominance (i.e., loci for which d = 0) cause neither inbreeding
depression nor heterosis. The amount of heterosis following a cross between
two particular lines or populations depends on the square of the difference
of gene frequency (y) between the populations. If the populations crossed do not
differ in gene frequency there will be no heterosis, and the heterosis will be greatest
when one allele is fixed in one population and the other allele in the other population.
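
The single-locus result can be checked numerically. The sketch below is only an illustration: the function names and the values chosen for a, d, p and y are assumptions made here, not taken from the text. It builds the F1 from the zygote frequencies of Table 14.3 and compares the resulting heterosis with dy^2 from equation [14.7].

```python
def parent_mean(p, a, d):
    """Population mean for one locus, equation [7.2]: a(p - q) + 2dpq."""
    q = 1.0 - p
    return a * (p - q) + 2.0 * d * p * q


def f1_mean(p, y, a, d):
    """Mean of the F1 between populations with frequencies p and p - y."""
    q = 1.0 - p
    freq_a1a1 = p * (p - y)
    freq_a1a2 = 2.0 * p * q + y * (p - q)
    freq_a2a2 = q * (q + y)
    return a * freq_a1a1 + d * freq_a1a2 - a * freq_a2a2


def f1_heterosis(p, y, a, d):
    """F1 mean minus the mid-parent value."""
    mid_parent = 0.5 * (parent_mean(p, a, d) + parent_mean(p - y, a, d))
    return f1_mean(p, y, a, d) - mid_parent


a, d = 2.0, 1.5  # illustrative genotypic values
for p, y in [(0.7, 0.0), (0.7, 0.3), (1.0, 1.0)]:
    h = f1_heterosis(p, y, a, d)
    print(f"p = {p}, y = {y}: heterosis = {h:.4f}, d*y^2 = {d * y * y:.4f}")
```

The three cases cover no difference in gene frequency (no heterosis), an intermediate difference, and opposite fixation (heterosis equal to d).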

Now consider the joint effects of all loci at which the two parent populations differ. 
In so far as the genotypic values attributable to the separate loci combine additively, 
we may represent the heterosis produced by the joint effects of all the loci as the 
sum of their separate contributions. Thus the heterosis in the F1 is

$$H_{F_1} = \sum dy^2 \qquad [14.8]$$

Three conclusions can be drawn from equation [14.8]: 

(1) If some loci are dominant in one direction and some in the other, their effects
will tend to cancel out, and no heterosis may be observed, in spite of the dominance
at the individual loci. The occurrence of heterosis on crossing is therefore, like inbreeding
depression, dependent on directional dominance, and the absence of heterosis
is not sufficient ground for concluding that the individual loci show no dominance.

(2) The amount of heterosis is something specific to each particular cross. The
genes by which two lines differ will not be the same for all pairs of lines, so different
pairs of lines will have different values of $\sum dy^2$ and will show different
amounts of heterosis.

(3) If the lines crossed are highly inbred, and so completely homozygous, the difference
of gene frequency between them can only be 0 or 1. The heterosis as shown
by equation [14.8] is then the sum of the dominance deviations d of those loci that
have different alleles in the two lines.
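
The three conclusions can be illustrated with a few lines of Python. This is a minimal sketch assuming additive combination of loci, as in equation [14.8]; the (d, y) values are invented for illustration.

```python
def f1_heterosis_total(loci):
    """Sum of d * y^2 over loci, equation [14.8].

    `loci` is a list of (d, y) pairs: the dominance deviation and the
    difference of gene frequency between the two parent populations.
    """
    return sum(d * y ** 2 for d, y in loci)


# Directional dominance: all d have the same sign, so contributions add up.
directional = [(1.0, 0.5), (0.5, 1.0), (2.0, 0.2)]

# Dominance in opposite directions: contributions cancel (conclusion 1).
opposing = [(1.0, 1.0), (-1.0, 1.0)]

# Fully inbred lines: y is 0 or 1, so the heterosis is the sum of d over
# the loci at which the lines carry different alleles (conclusion 3).
inbred_lines = [(0.8, 1), (0.3, 0), (1.2, 1)]

for name, loci in [("directional", directional),
                   ("opposing", opposing),
                   ("inbred lines", inbred_lines)]:
    print(f"{name}: heterosis = {f1_heterosis_total(loci):.2f}")
```

Different lists of (d, y) pairs stand for different crosses, which is the sense in which the heterosis is specific to each cross (conclusion 2).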

Before we go on to consider the F2 it is perhaps worth noting that the formulation
of the heterosis in terms of the square of the difference of gene frequency, in
equations [14.7] and [14.8], is quite in line with the previous formulation of the
inbreeding depression in terms of the coefficient of inbreeding. If we think of a population
subdivided into many lines, and we suppose that pairs of lines are taken at random,
then the mean squared difference of gene frequency between the pairs of lines
will be equal to twice the variance of gene frequency among the lines, i.e.,
$\overline{y^2} = 2\sigma_q^2$. The relationship between the variance of gene frequency and the coefficient
of inbreeding was given in equation [3.14] as $\sigma_q^2 = \bar{p}\bar{q}F$. Therefore $\overline{y^2} = 2\bar{p}\bar{q}F$,
showing that the mean amount of heterosis in crosses between random pairs of lines
is equal to the inbreeding depression as given in equation [14.2], though of opposite
sign.
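
The identity between the mean squared difference and twice the variance can be verified by direct enumeration. The sketch below assumes that the two lines of a pair are drawn independently (so that, with a finite list, a line may be paired with itself); with many lines this makes no practical difference. The gene frequencies listed are invented for illustration.

```python
from itertools import product


def variance(freqs):
    """Variance of gene frequency among lines (divisor n)."""
    mean = sum(freqs) / len(freqs)
    return sum((x - mean) ** 2 for x in freqs) / len(freqs)


def mean_squared_difference(freqs):
    """Mean of y^2 over all ordered pairs of lines drawn independently."""
    pairs = list(product(freqs, repeat=2))
    return sum((x - z) ** 2 for x, z in pairs) / len(pairs)


# Partly inbred lines with dispersed gene frequencies (illustrative values).
partly_inbred = [0.1, 0.3, 0.3, 0.5, 0.8]
print(mean_squared_difference(partly_inbred), 2 * variance(partly_inbred))

# Fully inbred lines (F = 1) from a base population with p = 0.3:
# 3 lines in 10 are fixed for one allele and 7 for the other.
fully_inbred = [1.0] * 3 + [0.0] * 7
p0, q0, F = 0.3, 0.7, 1.0
print(mean_squared_difference(fully_inbred), 2 * p0 * q0 * F)
```

In the fully inbred case the mean of y^2 is 2pq, the frequency with which a random pair of lines is fixed for different alleles.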

Now consider the F2 of a particular cross of two parent populations, the F2 being
made by random mating among the individuals of the F1. In consequence of the
random mating, the genotype frequencies in the F2 will be the Hardy-Weinberg
frequencies corresponding to the gene frequency in the F1. The mean genotypic
value of the F2 is then easily derived by application of equation [7.2]. The gene
frequency in the F1, being the mean of the gene frequencies in the two parent
populations, is (p - ½y) for one allele, and (q + ½y) for the other. Putting these
gene frequencies in place of p and q respectively in equation [7.2] gives the mean
genotypic value of the F2 as

$$\begin{aligned}
M_{F_2} &= a(p - \tfrac{1}{2}y - q - \tfrac{1}{2}y) + 2d(p - \tfrac{1}{2}y)(q + \tfrac{1}{2}y) \\
        &= a(p - q - y) + d[2pq + y(p - q) - \tfrac{1}{2}y^2] \qquad [14.9]
\end{aligned}$$

The amount of heterosis shown by the F2 is the difference between the F2 and mid-parent
values. So, from equations [14.5] and [14.9],

$$\begin{aligned}
H_{F_2} &= M_{F_2} - M_{\bar{P}} \\
        &= \tfrac{1}{2}dy^2 \\
        &= \tfrac{1}{2}H_{F_1} \qquad [14.10]
\end{aligned}$$

We find therefore that the heterosis shown by the F2 is only half as great as that
shown by the F1. In other words, the F2 is expected to drop back half-way from
the F1 value toward the mid-parent value. At first sight this conclusion may seem
to contradict the one arrived at earlier, when we were considering crosses between
many lines, the F1 and F2 means then being equal. The difference between the two
situations is that an F2 made by random mating among a large number of different
crosses has the same inbreeding coefficient as the F1. But an F2 made from an F1
derived from a single cross has inevitably an increased inbreeding coefficient. If
the inbreeding coefficient is worked out in the manner described in Chapter 5, it
will be found to be half the inbreeding coefficient of the parent lines. The change
of mean from F1 to F2 may therefore be regarded as inbreeding depression. It cannot
be overcome by having a large number of parents of the F2 because the restriction
of population size that causes the inbreeding has already been made in the single
cross of only two lines, or parent populations. There need, however, be no further
rise of the inbreeding coefficient in the F3 and subsequent generations. Provided,
therefore, that there is no other reason for the gene frequency to change, the population
mean will be the same in the generations following as in the F2.
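
The halving of the heterosis in the F2 can be confirmed with a short numerical sketch. The function names and the parameter values are assumptions made for illustration; the calculation uses only equation [7.2] and random union of gametes, for a single locus without epistasis.

```python
def mean_value(p, a, d):
    """Mean of a random-mating population, equation [7.2]."""
    q = 1.0 - p
    return a * (p - q) + 2.0 * d * p * q


def cross_means(p1, p2, a, d):
    """Return (mid-parent, F1 mean, F2 mean) for one locus.

    p1 and p2 are the frequencies of the same allele in the two parent
    populations, so that y = p1 - p2.
    """
    q1, q2 = 1.0 - p1, 1.0 - p2
    mid_parent = 0.5 * (mean_value(p1, a, d) + mean_value(p2, a, d))
    # F1: one gamete from each parent population.
    f1 = a * (p1 * p2 - q1 * q2) + d * (p1 * q2 + q1 * p2)
    # F2: Hardy-Weinberg frequencies at the F1 gene frequency.
    f2 = mean_value(0.5 * (p1 + p2), a, d)
    return mid_parent, f1, f2


mid_parent, f1, f2 = cross_means(p1=0.9, p2=0.3, a=1.0, d=0.8)
h_f1 = f1 - mid_parent
h_f2 = f2 - mid_parent
print(f"H_F1 = {h_f1:.3f}, H_F2 = {h_f2:.3f}")
```

With these values y = 0.6, so the F1 heterosis is 0.8 × 0.36 and the F2 retains half of it.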

That the heterosis expected in the F2 is half that found in the F1 is equally true
when the joint effects of all loci are considered, provided that epistatic interaction
is absent. The conclusion for a single locus was based on the principle that Hardy-Weinberg
equilibrium is attained by a single generation of random mating. This,
however, is not true with respect to genotypes at more than one locus considered
jointly, for reasons that were explained in Chapter 1. Therefore if there is epistatic
interaction, the population mean will not reach its equilibrium value in the F2, but
will approach it in subsequent generations more or less rapidly according to the
closeness of the linkage between the interacting loci. The existence of epistatic interaction
is intimately connected with the scale of measurement, but this matter will
not be discussed until Chapter 17. Here we need only note that, for reasons connected
with the scale of measurement, the halving of the heterosis in the F2 expected
on theoretical grounds is not often found at all exactly in practice, though the F2
usually falls somewhere between the F1 and mid-parent values. Some examples
from plants of the heterosis observed in the F1 and F2 generations are illustrated
in Fig. 14.3. It will be noticed that with some of the characters shown the F1 and
F2 are lower in value than the mid-parent, and the heterosis is consequently negative
in sign. This is in no way inconsistent with our definition of heterosis as the difference
between the F1 or F2 and the mid-parent value. The sign of the difference
depends simply on the nature of the measurement. For example, the character ‘days
to first fruit’, represented in graphs (e) to (h), shows heterosis of negative sign:
but if the character were called ‘speed of development’ and expressed as a reciprocal
of time the heterosis would be positive in sign. Not all the crosses in Fig. 14.3 provide
heterosis that would be useful to a breeder. If the mean of the character is the
only criterion of value, a cross is of no use unless the F1 is better than both of the
two parental lines. The term heterosis is sometimes used to mean useful heterosis,
that is to say the amount by which the F1 exceeds the better parent line.

Fig. 14.3. Some illustrations of heterosis observed in crosses between pairs of highly inbred
strains of plants. The points show the mean values of the two parent strains, the F1 and the F2
generations. The mid-parent values are shown by horizontal lines. Graph (a) refers to tobacco,
Nicotiana rustica. (Data from Smith, 1952.) All the other graphs refer to tomatoes, Lycopersicon.
(Data from Powers, 1952.) The characters represented are:

(a) Height of plant (in)
(b) Mean weight of one fruit (g)
(c) Number of locules per fruit
(d) Mean weight per locule (g)
(e)-(h) Mean time in days between the planting of the seed and the ripening of the first fruit, in
4 different crosses.

Maternal effects. The relative amount of heterosis observed in the F1 and F2
generations may be complicated by maternal effects. A character subject to a maternal
effect, such as litter size, has two components belonging to different generations.
Each component is expected to follow the general pattern of heterosis in the
F1 and F2 described above, but the two components are one generation out of step
with each other. Thus the heterosis observed in the F1 is attributable to the non-maternal
part, the maternal effect being still at the inbred level. In the F2, however,
the non-maternal part will lose half the heterosis as explained above, but the maternal
effect will now show the full effect of its heterosis since the mothers are now
in the F1 stage. This rather complicated situation may perhaps be more readily
grasped from the diagrammatic representation in Fig. 14.4. This shows how the
heterosis appears in two steps, the first due to the non-maternal component and the
second due to the maternal component. These two steps were illustrated in Example
14.2.
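
The one-generation lag between the two components can be sketched as follows. This is an illustration under stated assumptions rather than a general model: epistasis is ignored, the two components are taken to combine additively, the F2 and later generations are made by random mating, and the amounts of heterosis (1.0 for each component) are arbitrary.

```python
def component_heterosis(generation, full_heterosis):
    """Heterosis of one component by the generation of the genotype
    expressing it: 0 = inbred parent, 1 = F1, 2 or more = F2 onward."""
    if generation <= 0:
        return 0.0
    if generation == 1:
        return full_heterosis
    return 0.5 * full_heterosis


def observed_heterosis(generation, h_non_maternal, h_maternal):
    """Heterosis in the character as measured.

    The non-maternal component depends on the genotype of the young,
    the maternal component on that of the mothers, which belong to the
    previous generation.
    """
    direct = component_heterosis(generation, h_non_maternal)
    maternal = component_heterosis(generation - 1, h_maternal)
    return direct + maternal


for generation, label in enumerate(["parents", "F1", "F2", "F3", "F4"]):
    total = observed_heterosis(generation, 1.0, 1.0)
    print(f"{label}: heterosis = {total:.1f}")
```

With equal components the measured heterosis rises in the F1, rises again in the F2, and settles in the F3 at half the sum of the two full amounts.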

Epistasis. We have seen that the amount of heterosis shown by a particular cross 
depends, among other things, on the differences of gene frequency between the two 
populations crossed. This would seem to indicate that the amount of heterosis would 
increase with the degree of genetic differentiation between the two populations and 
would be limited only by the barrier of interspecific sterility. This, however, is not 
true. Populations that are widely differentiated through adaptations to local conditions
may fail to show heterosis and may suffer a reduction of fitness in the F2
generation, as has been shown by studies of Drosophila populations (Wallace and
Vetukhiv, 1955). The failure of wide crosses to show the heterosis that might have
been expected can be attributed to epistatic interaction, which we have so far assumed
to be absent. Adaptation to widely different local conditions involves many different
characters because the fitness of an organism depends on the harmonious interrelations
of all its functions. Genes at many loci are therefore selected for their joint
effects on fitness; the combinations of genes selected in this way are said to be 
‘coadapted’. In other words, some of the adaptation comes from epistatic interactions. 
When two populations adapted to different conditions are crossed, the hybrids are 
adapted to neither. Furthermore, the favourable epistatic combinations of genes are
lost by segregation in the F2. The role of epistasis in relation to heterosis is summarized
by Mayo (1987, p. 148) and is reviewed by Geiger (1988). Figure 14.5
and Example 14.3 show how heterosis for yield in maize is related to the width of
the cross.

Fig. 14.4. Diagram of the heterosis expected in a character subject to a maternal effect, when two
lines are crossed and the F2 and subsequent generations are made by random mating. The
maternal and non-maternal components of the character separately are here supposed to show
equal amounts of heterosis, and to combine by simple addition to give the character as it is
measured. (Horizontal axis: generation. Curves: character as measured, non-maternal component,
maternal component.)

Fig. 14.5. Relationship between heterosis and width of cross as shown by yield of maize, as
explained in Example 14.3. The heterosis is expressed as the percentage difference from the mid-parent.
The mid-parent yields in g per plant are shown in the lower graph. (Data from Moll et al.,
1965.)

Example 14.3 (Data from Moll et al., 1965). Varieties, or populations, of maize are
adapted to different geographical regions. Two varieties from each of four regions were
chosen and all of the 28 possible crosses were made. F2 generations were made by random
mating among the individuals of each F1. The varieties and the crosses were grown
in each of the four regions of origin. The regions were: (A) Southeast USA, (B) Midwest
USA, (C) Puerto Rico, (D) Mexico. The degree of diversity in the crosses was assessed
in seven grades according to the known ancestral relationships of the varieties and the
climatic conditions of the regions. Grade 1 was crosses between the varieties from the
same region. The other grades were: 2 = A × B; 3 = A × C; 4 = B × C; 5 =
A × D; 6 = B × D; 7 = C × D. Figure 14.5 shows the heterosis in the F1’s and F2’s
as percentages of the mean of the parental varieties, plotted against the degree of diversity.
The mean parental yields are also shown. The greatest heterosis was shown by the
crosses of intermediate ‘width’. The widest crosses gave much less heterosis though it
was still positive even in the F2 generations.

Problems

14.1 Suppose that a random-bred population of maize had a mean yield of 140 g
per plant, and that four loci with known effects on yield were segregating. What
would be the inbreeding depression caused by these loci after one generation of self-pollination,
if the gene effects were as given below? The gene effects in each case
are the differences in yield (g per plant) between the genotype listed and the ‘AA’
homozygote; the gene frequency is that of the ‘a’ allele.

             Difference from AA
Locus        Aa          aa          q_a

(1)          -10         -20         0.5
(2)          +5          -30         0.5
(3)          -20         -30         0.2
(4)          0           -60         0.1


[Solution 69] 

14.2 If inbreeding with selection in the maize population specified in Problem 14.1 
succeeded in fixing the more favourable allele at each of the four loci, by how much 
would the yield be increased? 


[Solution 79] 

14.3 From the data on mouse litter size in Example 14.2 calculate how much inbreeding
depression of litter size results from inbreeding in the mother and how
much from inbreeding in the litter. Assume that both are linear with respect to F,
and that the maternal and litter effects combine additively. What would be the
predicted litter size if mice could be inbred to F = 100 per cent without any selection
operating?


[Solution 89] 

14.4 Use the results of Problem 14.3 to predict the total inbreeding depression 
of litter size when the inbreeding coefficient of the mothers is 0.56 and that of the 
litters is 0.64. These were the inbreeding coefficients in the last generation of the 
experiment depicted in Fig. 14.2(a). This was a different experiment from that of 
Example 14.2 used in Problem 14.3. How well do the two experiments agree? 

[Solution 99] 

14.5 A control line of mice was kept for 30 generations and its litter size showed 
no evidence of inbreeding depression in spite of its effective population size being 
not greater than 40. One possible reason for the absence of inbreeding depression 
is that there may have been some inadvertent selection for litter size. The mating 
system was intended to be minimal inbreeding, one young female being taken at 
random from each family. In theory no selection was possible, but minimal inbreeding 
cannot be applied strictly in practice because some families contain no surviving 
female. Replacements have then to be taken from other families, and some selection
can be caused by these replacements. How much selection would have been needed
to counteract inbreeding depression at the rate expected from the solution of Problem
14.3? The selection resulting from replacements would have been individual
selection, and the realized heritability appropriate to individual selection was estimated
in the same strain to be 0.22.

Data from Falconer, D. S. (1960) J. Cell. Comp. Physiol., 56 (Suppl. 1), 153-67.

[Solution 109] 

14.6 Crosses were made between varieties of cultivated tomatoes, which are
normally self-fertilizing. The mean weights (kg) of fruit produced per plant in a
three-week period by the parental varieties and the F1 in two crosses were as
follows.

Cross      P1        P2        F1

(1)        1.44      1.36      1.41
(2)        1.28      0.88      1.42

What would be the predicted yields of the F2 generations, and of the F3 generations
produced by selfing the F2?

Data from Williams, W. (1960) Genetics, 45, 1457-65.

[Solution 119] 

14.7 In Problem 6.1 the mean leaf numbers in the F1 and F2 generations of a cross
of two varieties of tobacco were calculated. The mean of the F1 was 15.72 ± 0.24
and the mean of the F2 was 15.84 ± 0.49. Without knowing the leaf numbers in
the parental varieties, what can you conclude about the heterosis shown by the F1?

[Solution 129] 



15 INBREEDING AND CROSSBREEDING:
II. Changes of variance


The effect of inbreeding on the genetic variance of a metric character is apparent, 
in its general nature, from the description of the changes of gene frequency given 
in Chapter 3. Again, we have to imagine the whole population, consisting of many 
lines. Under the dispersive effect of inbreeding, or random drift, the gene frequencies
in the separate lines tend toward the extreme values of 0 or 1, and the lines
become differentiated in gene frequency. Since the mean genotypic value of a metric
character depends on the gene frequencies at the loci affecting it, the lines become
differentiated, or drift apart, in mean genotypic value. And, since the genetic components
of variance diminish as the gene frequencies tend toward extreme values
(see Fig. 8.1), the genetic variance within the lines decreases. The general consequence
of inbreeding, therefore, is a redistribution of the genetic variance; the component
appearing between the means of lines increases, while the component appearing
within the lines decreases. In other words, inbreeding leads to genetic differentiation
between lines and genetic uniformity within lines.

The subdivision of an inbred population into lines introduces an additional observational
component of variance, the between-line component, and it is not surprising
that this adds a considerable complication to the theoretical description of the
components of genetic variance. Only a brief description of the main outlines will
be given here. For detailed treatment see Wright (1969), and Weir and Cockerham
(1977). Similarly, when lines are crossed, the variance can be partitioned into components
between and within crosses. The variance of crosses will be described at
the end of this chapter. The redistribution of genetic variance is not the only effect 
of inbreeding; experiments have shown that the environmental variance is 
sometimes also affected. The greater sensitivity of inbred individuals to environmental 
sources of variation was mentioned earlier, in Chapter 8. This phenomenon interferes 
with the experimental study of the changes of variance, and until it is better understood 
we cannot put much reliance on the theoretical expectations concerning the genetic 
variance being manifest in the observed phenotypic variance. Another matter concerning
inbreeding to be considered is the genetic stability of highly inbred lines,
which is important in connection with the use of ‘standard’ inbred strains for 
experimental purposes. 

Inbreeding

Redistribution of genetic variance 

The redistribution of variance arising from additive genes (i.e. genes with no 
dominance) is easily deduced. This is because with additive genes the proportion 
in which the original variance is distributed within and between lines does not depend 
on the original gene frequencies. When there is dominance, however, we cannot 
deduce the changes of variance without a knowledge of the initial gene frequencies. 
This not only adds to the mathematical complexity, but it renders a general solution 
impossible. We shall first consider the case of additive genes, and then very briefly 
indicate the conclusions arrived at for dominant genes. The effect of selection will 
not be specifically discussed. We need only note that natural selection will tend to 
render the actual state of dispersion of gene frequencies less than that indicated by 
the inbreeding coefficient computed from the population size or pedigree relationships. 

No dominance. What follows refers to the variance arising from additive genes: 
it does not apply to the additive variance arising from genes with dominance. The 
conclusions therefore apply, strictly speaking, only to characters with no non-additive 
variance. They serve, however, to indicate the general effect of inbreeding on 
variance, and may be taken as a fair approximation to what is expected of characters 
that show little non-additive genetic variance. The description to be given refers 
to slow inbreeding, which means that the inbreeding coefficient can be taken, without 
too much error, to be the same in two consecutive generations. The redistribution 
of the variance under rapid inbreeding is, however, not very different except in the 
first few generations. 

Consider first a single locus. When there is no dominance the genotypic variance
in the base population, given in equation [8.5], is

$$V_G = V_A = 2p_0q_0a^2$$

The variance within any one line is $V_G = 2pqa^2$, where p and q are the gene
frequencies in that line. The mean variance within lines is

$$V_{Gw} = 2\overline{(pq)}\,a^2$$

where $\overline{(pq)}$ is the mean value of pq over all lines. Now, $2\overline{(pq)}$ is the overall frequency
of heterozygotes in the whole population, which, by Table 3.1, is equal to
$2p_0q_0(1 - F)$, where F is the coefficient of inbreeding. Therefore

$$\begin{aligned}
V_{Gw} &= 2p_0q_0a^2(1 - F) \\
       &= V_G(1 - F)
\end{aligned}$$

and this remains true when summation of the variances is made over all loci. Thus
the within-line variance is (1 - F) times the original variance, and as F approaches
unity the within-line variance approaches zero.

Now consider the between-line variance. This is the variance of the true means 
of lines, and would be estimated from an analysis of variance as the between-line 
component. For a single locus, still with no dominance, the mean genotypic value 
of a line with gene frequency p and q is obtained from equation [7.2] as M =
a(p - q) = a(1 - 2q). Thus we want to find the variance of (a - 2qa). Epistasis
is assumed to be absent, so a is constant, i.e. the same in all lines. Therefore

$$\sigma_M^2 = \sigma^2(2aq) = 4a^2\sigma_q^2 = 4a^2p_0q_0F$$

(from equation [3.14]), which equals $2FV_G$. This remains true when summation is
made over all loci. Thus the between-line genetic variance is 2F times the genetic
variance in the base population.

Table 15.1 Partitioning of the variance in a population with inbreeding coefficient F,
when the genetic variance V_G in the base population is due to genes with no dominance;
f is the coancestry of individuals in the same line. (1) Slow inbreeding, when
F_{t+1} = F_t approximately. (2) Rapid inbreeding.

                    (1)               (2)

Between lines       2F V_G            2f V_G
Within lines        (1 - F) V_G       (1 + F - 2f) V_G
Total               (1 + F) V_G       (1 + F) V_G
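
The slow-inbreeding partition derived above is easily tabulated. The sketch below is illustrative only; the function name and the value of V_G are assumptions, and it covers additive genes under slow inbreeding, not the rapid-inbreeding case.

```python
def partition_genetic_variance(v_g, f_coef):
    """Between-line, within-line and total genetic variance for additive
    genes under slow inbreeding, given the base-population variance v_g
    and the inbreeding coefficient f_coef."""
    between = 2.0 * f_coef * v_g
    within = (1.0 - f_coef) * v_g
    return between, within, between + within


v_g = 10.0  # illustrative base-population genetic variance
for f_coef in (0.0, 0.25, 0.5, 1.0):
    between, within, total = partition_genetic_variance(v_g, f_coef)
    print(f"F = {f_coef:.2f}: between = {between:5.1f}, "
          f"within = {within:5.1f}, total = {total:5.1f}")
```

The total is always (1 + F) times the original variance, so at F = 1 it is doubled and lies wholly between lines.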

The partitioning of the genetic variance into components as explained above is
summarized in Table 15.1, column (1). When inbreeding is rapid the components
are as in column (2) for reasons explained by Crow and Kimura (1970, p. 138).
Here f is the coancestry of individuals of the same line, and this is equal to the
inbreeding coefficient in the next generation if individuals were mated at random
within lines. From Table 15.1 we see that the total genetic variance in the whole
population is the sum of the within-line and between-line components, and is equal
to (1 + F) times the original genetic variance. Thus when inbreeding is complete the
genetic variance in the population as a whole is doubled, and all of it appears as
the between-line component. The increasing variance between lines is illustrated from
experimental data in Fig. 15.1.

Fig. 15.1. Differentiation between lines by random drift, shown by abdominal bristle number in
Drosophila melanogaster. The graphs show the mean bristle number in each of 10 lines during
full-sib inbreeding without artificial selection. (After Rasmuson, 1952.)

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The genetic variance within lines, before inbreeding is complete, is partitioned 
within and between the families of which the lines are composed. Under slow 
inbreeding with random mating within the lines, it is partitioned equally within and 
between full-sib families. The covariance of relatives within the lines is just as 
described in Chapter 9, each line being a separate random-breeding population 
with a total genetic variance of $(1 - F)V_G$, on the average. From this we can
deduce what the heritability is expected to be within any one line. It will be
$(1 - F)V_G/[(1 - F)V_G + V_E]$, and this reduces to

$$h_t^2 = \frac{h_0^2(1 - F_t)}{1 - h_0^2 F_t} \qquad [15.1]$$

where $h_t^2$ and $F_t$ are the heritability within lines and the inbreeding coefficient at
time t, and $h_0^2$ is the original heritability in the base population. This shows how
the heritability is expected to decline with the inbreeding in a small population. The 
formula, however, is applicable only to characters with no non-additive variance, 
and in the absence of selection. The operation of natural selection renders the reduction
of the heritability less than expected, especially under slow inbreeding. This
point has been demonstrated experimentally with Drosophila (Tantawy and Reeve, 
1956). 
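
As a hedged numerical illustration of equation [15.1], the sketch below computes the expected within-line heritability in two ways: from the variance components directly, and from the reduced formula. The function names and the example values are illustrative assumptions; like the formula itself, the sketch applies only to characters with no non-additive variance and ignores selection.

```python
def heritability_from_variances(v_g, v_e, F_t):
    """Within-line heritability from (1 - F)V_G / [(1 - F)V_G + V_E]."""
    remaining = (1.0 - F_t) * v_g
    return remaining / (remaining + v_e)


def heritability_within_lines(h2_base, F_t):
    """Equation [15.1]: heritability within lines at inbreeding coefficient F_t."""
    denominator = 1.0 - h2_base * F_t
    if denominator == 0.0:
        raise ValueError("undefined when both h2_base and F_t equal 1")
    return h2_base * (1.0 - F_t) / denominator


if __name__ == "__main__":
    # Hypothetical base population: V_G = 0.4, V_E = 0.6, so h2 = 0.4
    for F_t in (0.0, 0.25, 0.5, 0.75):
        a = heritability_from_variances(0.4, 0.6, F_t)
        b = heritability_within_lines(0.4, F_t)
        print(f"F = {F_t:4.2f}  h2 (variances) = {a:.4f}  h2 [15.1] = {b:.4f}")
```

The two columns agree, which is simply a check that the reduction to [15.1] is algebraically consistent.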


Dominance. The changes in the components of variance arising from additive genes 
will have been seen to be independent of the gene frequencies in the base population.
When we consider genes with any degree of dominance, however, we find
that the changes of variance on inbreeding depend on the initial gene frequencies, 
and this makes it impossible to give a general solution in terms of the genetic variance 
present in the base population. We shall therefore do no more than give the conclusions
arrived at by Robertson (1952) for the case of fully dominant genes, when
the recessive allele is at low frequency. This is the situation most likely to apply 
to variation in fitness arising from deleterious recessive genes, though the effects 
of selection are here disregarded. Figure 15.2 shows the redistribution of variance 
arising from recessive genes at a frequency of q = 0.1 in the base population. Figure 
15.2(a) refers to full-sib mating with only one family in each line, and Fig. 15.2(b)
refers to slow inbreeding. A surprising feature of the conclusions is that the within- 
line variance at first increases, reaching a maximum when the coefficient of inbreeding 
is a little under 0.5, and it remains at a fairly high level until the coefficient of 
inbreeding approaches 1. The reason, in general terms, for the apparent anomaly 
that the variation within lines increases during the first stages of inbreeding can be 
seen from a consideration of the relationship between the gene frequency and the 
variance arising from a dominant gene shown in Fig. 8.1(b). The gene frequency
is taken to start at a value of 0.1, and on inbreeding it will increase in some lines 
and decrease in others, the increase being on the average equal in amount to the 
decrease. But examination of the graph shows that an increase of gene frequency 
by a certain amount will increase the variance more than a decrease of the same 
amount will reduce it. Therefore, on average, the variance within the lines will increase
in the early stages of inbreeding. This increase of variance would be detectable
in practice only if a substantial part of the genetic variance were due to recessive
genes at low frequencies. 
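
The asymmetry described above can be checked numerically. The sketch below is an illustration under stated assumptions, not Robertson's calculation: it assumes random mating within each line, complete dominance, and genotypic values of +a for the dominant phenotype and -a for recessive homozygotes, so that the genotypic variance of a line in which the recessive allele has frequency q is $4a^2q^2(1 - q^2)$. The function name and the size of the frequency shift are illustrative.

```python
def variance_full_dominance(q, a=1.0):
    """Genotypic variance at one fully dominant locus under random mating.

    Recessive homozygotes (frequency q**2) have value -a; all other
    genotypes (frequency 1 - q**2) have value +a.
    """
    freq_recessive = q ** 2
    return (2.0 * a) ** 2 * freq_recessive * (1.0 - freq_recessive)


if __name__ == "__main__":
    q0, shift = 0.1, 0.05  # starting frequency from the text; shift is hypothetical
    start = variance_full_dominance(q0)
    gain = variance_full_dominance(q0 + shift) - start
    loss = start - variance_full_dominance(q0 - shift)
    print(f"variance at q0: {start:.4f}")
    print(f"gain when q rises by {shift}: {gain:.4f}")
    print(f"loss when q falls by {shift}: {loss:.4f}")
```

With these values the gain exceeds the loss, so a pair of lines drifting equally in opposite directions has a higher average variance than the starting population.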

Fig. 15.2. Redistribution of variance arising from a single fully recessive gene with initial
frequency $q_0 = 0.1$: (a) with full-sib mating, (b) with slow inbreeding. (After Robertson, 1952.)
$V_t$ = total genetic variance; $V_b$ = between-line component; $V_w$ = within-line component;
$V_a$ = additive genetic variance within lines.

Environmental variance 

Several times in previous chapters we have referred to the fact that the environmental
component of variance may differ according to the genotype; in particular, that
inbred individuals often show more environmental variation than non-inbred 
individuals. This fact has been revealed by many experiments in which the variances 
of inbreds and of hybrids have been compared. If highly inbred lines are more variable 
than the $F_1$ cross between them (i.e. the ‘hybrid’) this difference must be attributed
to the environmental component of variance, because the genetic variance is negligible 
in amount in the hybrids as well as in the inbred lines. The greater susceptibility 
of inbreds than of hybrids to environmental sources of variation is not a universal 
phenomenon, but it has been observed in a wide variety of characters and organisms. 
Some examples are cited in Table 15.2; others are given by Lerner (1954) and Wright
(1977). The phenomenon has been extensively studied in behavioural characters of 
rats and mice; these have been reviewed by Hyde (1973), who found that 14 out 
of 19 behavioural characters showed the phenomenon. 

Before we consider the possible reasons for the differences of environmental 
variance, there are two points in connection with the phenomenon that should be 
mentioned. The first is a technical matter. If the mean of the character differs between
inbreds and hybrids, as it frequently does, then it may be difficult to decide
on a proper basis for the comparison of the variances. It is necessary to be sure 
that any difference of variance found is not simply a reflection of the difference 
of mean. In other words, a measure of variation must be found that is uncorrelated 
with the mean. The coefficient of variation is often, though not always, an appropriate 
measure. The problem is basically a matter of the choice of scale, and will be considered
again in Chapter 17. The second point concerns the nature of the environmental
variation that is being measured. There is a distinction to be made between variation
induced by the environment and adaptive responses by the organism to the particular 

Table 15.2 Examples of characters with phenotypic variance greater in inbreds than
in hybrids. The values are phenotypic variances averaged over the inbred lines and over
the $F_1$'s where more than one cross was made. (C.V.)² = squared coefficient of
variation.

                                                            Inbreds    Hybrids
Drosophila melanogaster (Robertson and Reeve, 1952)
  Wing-length: 6 inbreds, 6 $F_1$'s                         2.35       1.24
Mice (McLaren and Michie, 1956)
  Duration of ‘Nembutal’ anaesthesia (log minutes):
  2 inbreds, 1 $F_1$                                        0.0665     0.0165
Mice (Yoon, 1955)
  Age at opening of vagina (days): 3 inbreds, 2 $F_1$'s     51.7       17.4
Mice (Chai, 1957)
  Weight (g) at ages given, (C.V.)²: 2 inbreds, 1 $F_1$
    Birth                                                   119        59
    21 days                                                 98         47
    60 days                                                 24         19
Rats (Livesay, 1930)
  Weight at 90 days, (C.V.)²: 3 inbreds, 2 $F_1$'s          522        170
Maize (Shank and Adams, 1960): 10 inbreds, 5 $F_1$'s
  Plant height, (C.V.)²                                     44         30
  Ear weight, (C.V.)²                                       412        198

environmental circumstances. The distinction is necessary when one considers the 
possible relation of variation to fitness. In the first case, the presumption is that there 
is an optimal phenotype that is the same for all the environmental circumstances 
under consideration. The body temperature of mammals is an obvious example. 
Insensitivity to environmental variables is then an aspect of fitness: the fittest 
individuals are those that can regulate their development, or their physiological processes,
so as to achieve the optimal phenotype despite sub-optimal values of the
environmental variables. The restriction of variation in this way is called homeostasis. 
The environmental factors causing the variation are not necessarily external, but 
include also internal causes of developmental variation. In the second case, when 
the organism responds adaptively, the character has different optima in different 
environments. The output of sweat in relation to variation of ambient temperature 
is an example. With characters of this sort the relation with fitness is reversed: 
individuals with the ability to vary are fitter than those without the ability. 

What then is the cause of some characters being more variable in inbreds than 
in hybrids? The phenomenon can be looked on as a manifestation of inbreeding 
depression or heterosis (Mather, 1953). On this view, a character showing the 
phenomenon is one for which homeostasis is beneficial, and inbreeding has reduced 
this component of fitness, causing environmental variance to be increased in inbreds 
and correspondingly reduced when hybrids are compared with inbreds. This interpretation
implies that there are genes that affect variability, i.e., homeostasis,
independently of any effect they may have on the mean of the character, and that
these genes must have some degree of directional dominance for increased
homeostasis. Another interpretation of the phenomenon is in biochemical terms. Different
allelic forms of enzymes often have their maximal enzymic activity at different
values of environmental variables such as temperature or pH. Heterozygotes,
with both forms of the enzyme, may therefore maintain an adequate level of enzyme
activity over a wider range of environmental variation than homozygotes, with
only one form of the enzyme, can do. In so far as the enzyme activity is reflected 
in a measured character, then, heterozygotes are less sensitive to environmental 
variables than are homozygotes. On this view, the stability is not necessarily related 
to fitness, but is simply a biochemical consequence of heterozygosity at some loci 
that affect the mean of the character. These loci are consequently overdominant with 
respect to their effect on variability, though not necessarily in their effect on the 
mean of the character or on fitness. 

Uniformity of inbred strains 

Inbred strains of laboratory animals, particularly mice, are widely used as experimental
material in testing and assay, and in many other areas of biological research.
The inbred strains are single lines and are used because uniformity is desired. For 
some purposes, when for example the absence of antigenic differences is necessary, 
it is genetic uniformity that is required. Abundant experience has shown that the 
highly inbred strains of laboratory mice fully satisfy this requirement of genetic uniformity.
For other purposes, however, it is not genetic uniformity alone that matters,
but phenotypic uniformity. The less variable the animals, the smaller the number
that need be used to attain a given degree of precision in measuring their response 
to a treatment. The value of inbred animals in this respect therefore depends on how 
much of the variance of the character is genetic in origin, because only the genetic 
variance is removed by inbreeding. It depends also on how much the environmental 
variance is affected by inbreeding; the increased environmental variance, as 
exemplified in Table 15.2, may sometimes offset the reduced genetic variance so 
that an inbred strain is more variable phenotypically than a non-inbred stock. The 
way to obtain genetic uniformity without increased environmental variation is, of 
course, to use the $F_1$ of a cross between two inbred strains. This has the added advantage
that the $F_1$ individuals are usually healthier, more viable and more fertile
than the inbreds, though it does not reduce the cost of production since the inbreds 
have to be maintained as parents. Another point in connection with the use of inbred 
or $F_1$ animals may be mentioned. An inbred strain or the $F_1$ of two inbred strains
has a unique genotype; and that of an inbred, moreover, is one that cannot occur 
in a natural population. Testing the response to any treatment on one inbred strain 
or one hybrid is therefore testing it on one genotype. There may be appreciable differences
of response between genotypes, and consequently results obtained with one
strain or cross cannot be relied on to be applicable to other strains. 

Mutation 

We saw in Chapter 12 that mutation generates new variation of metric characters 
that is not negligible in the context of long-term selection. How much does mutation 
affect the conclusions reached about the consequences of inbreeding? Consider first 
selectively neutral mutants at individual loci. The rate of origin of allelic differences
was shown in Chapter 4 to be equal simply to the mutation rate to neutral alleles,
irrespective of the population size. To repeat the argument in the present context:
in a sib-mated line there are four representatives of each autosomal locus, so the
rate of occurrence of mutations at a particular locus is 4u, where u is the mutation
rate per gamete per generation. But a mutation that has occurred has a one in four
chance of becoming fixed. So the rate of allelic substitution per generation is equal
to the mutation rate. Mutants that are not strictly neutral can become fixed, especially
in very small populations. In Chapter 4 it was stated that a mutant is ‘effectively
neutral’ if the coefficient of selection against it is less than $1/2N_e$, which is about
0.2 in a sib-mated line (see Chapter 5). Thus if, for example, the rate of mutation
to alleles with coefficients of selection against them of less than about 20 per cent
were $10^{-5}$, and if the total number of loci were 10,000, then in any one subline there
would be, on average, 1 allelic substitution every 10 generations
($10^{-5} \times 10^4 \times t = 1$; t = 10). With sib mating a new mutant is very soon fixed or lost, so the number
of loci segregating at any one time is very small. Thus the conclusion drawn about
the genetic uniformity of sib-mated lines is very little affected by mutation. The long-term
constancy of inbred lines is, however, not absolute and mutation is not negligible
when we consider metric characters, as we shall now see.
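
The arithmetic in this paragraph can be laid out as a short sketch. The function names are illustrative, and the mutation rate and number of loci are the example values used in the text, not estimates.

```python
def substitution_rate_per_locus(u, copies=4):
    """Rate of allelic substitution at one locus in a line with `copies`
    gene copies: mutations arise at rate copies * u and each has a
    1 / copies chance of fixation."""
    return copies * u * (1.0 / copies)


def effectively_neutral_threshold(n_e):
    """Selection coefficient below which a mutant is 'effectively neutral'."""
    return 1.0 / (2.0 * n_e)


def generations_per_substitution(u, n_loci):
    """Average number of generations between substitutions over all loci."""
    return 1.0 / (substitution_rate_per_locus(u) * n_loci)


if __name__ == "__main__":
    print("threshold with N_e = 2.6:", round(effectively_neutral_threshold(2.6), 2))
    print("generations per substitution:",
          round(generations_per_substitution(1e-5, 10_000), 1))
```

The number of gene copies cancels out of the substitution rate, which is why the rate is independent of population size.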

We cannot extend the foregoing argument to metric characters because we do not 
know the number of loci at which mutation affects the character, nor the distribution
of their effects. What matters, however, is the rate at which new variance is
generated by mutation at all relevant loci, and some estimates of this rate are available.
The variance generated by mutation in one generation may be called the mutational
variance, symbolized by $V_m$. This is most conveniently expressed as a proportion
of the environmental variance $V_E$. Estimates from various characters in several
species show that on average $V_m$ is of the order of $V_E \times 10^{-3}$. (These estimates are
reviewed by Lynch, 1988, and are considered again in Chapter 20.) This new variation
arising in each generation by mutation accumulates until the gain in each generation
is balanced by the loss from fixation due to the inbreeding. The genetic variance
thereafter remains constant. With close inbreeding this constant state is attained after 
only a few generations (see Lynch and Hill, 1986). 

The amount of genetic variance lost in each generation can be deduced as follows. 
The proportion of the original variance remaining within lines is (1 - F) as given
in Table 15.1. If we write F in terms of $\Delta F$ by equation [3.12] and $\Delta F$ in terms
of $N_e$ by equation [4.1], and if we put t = 1, we obtain $1 - (1/2N_e)$ as the proportion
remaining after one generation. If we let $V_g$ be the existing variance in any
generation then the amount lost in the next generation is $V_g/2N_e$. When this is
equated to the new mutational variance, $V_m$, we find

$$V_g = 2N_eV_m$$

This is the constant amount of genetic variance within an inbred line. It refers to
neutral genes with additive effects on the character, though non-additivity makes
little difference (see Lynch and Hill, 1986). With sib mating $N_e = 2.6$ (see Chapter
5), and if we substitute this and the estimate given above for $V_m$, we find
$V_g = (5 \times 10^{-3})V_E$ (approx.). This may be more meaningful if expressed as the heritability,
which is $V_g/(V_g + V_E) = 0.5$ per cent. Again, therefore, mutation does
not affect the conclusion that sib-mated lines are for most practical purposes genetically
uniform.
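
The balance between mutational gain and loss by fixation can be followed generation by generation. This is a minimal sketch under the assumptions of the text (neutral additive genes, constant effective size); the function names are illustrative, and variances are expressed in units of the environmental variance, so the mutational variance is entered as 0.001.

```python
def equilibrium_genetic_variance(n_e, v_m):
    """Stable genetic variance within a line: V_g = 2 * N_e * V_m."""
    return 2.0 * n_e * v_m


def iterate_genetic_variance(n_e, v_m, generations, v_start=0.0):
    """Apply loss of V_g / (2 N_e) and gain of V_m in each generation."""
    v_g = v_start
    for _ in range(generations):
        v_g = v_g * (1.0 - 1.0 / (2.0 * n_e)) + v_m
    return v_g


if __name__ == "__main__":
    n_e, v_m, v_e = 2.6, 1e-3, 1.0  # sib mating; V_m in units of V_E
    v_eq = equilibrium_genetic_variance(n_e, v_m)
    print(f"equilibrium V_g = {v_eq:.4f} V_E")
    print(f"heritability    = {100 * v_eq / (v_eq + v_e):.2f} per cent")
    for t in (1, 5, 10, 20):
        print(f"after {t:2d} generations: V_g = {iterate_genetic_variance(n_e, v_m, t):.5f}")
```

Starting from zero variance, the iteration approaches the equilibrium within a few tens of generations, in line with the statement that the constant state is attained quickly under close inbreeding.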

Subline divergence 

In contrast to its negligible effect on the variance within sib-mated lines, mutation 
does have important consequences if an inbred strain is split into sublines. The use 
of standard inbred strains facilitates the comparison of results from different 
laboratories. Dispersion to different laboratories inevitably leads to the strain being 
split into sublines and it is important to know the extent to which the sublines become 
differentiated by random drift acting on the mutational variance. 

If the strain was highly inbred before separation into sublines none of the original 
variance in the base population will be left and, furthermore, the variance originating 
from mutation will have reached its stable level of $2N_eV_m$, and this will also be the
stable level within sublines after their separation. The amount by which the variance
between sublines increases in each generation is as follows. The variance between
sublines is $2FV_g$ (by Table 15.1). Putting F in terms of $N_e$ as was done above for
the within-line variance gives the rate at which the variance between sublines increases
per generation as $V_g/N_e$. Substituting $V_g = 2N_eV_m$ we find that the between-line
variance increases by $2V_m$ in each generation. (This rate is independent of the
population size because larger populations have more within-line variance but differentiate
more slowly by random drift.) If the sublines were separated t generations
previously then the variance of subline means will be $2tV_m$ because $2V_m$ is the
rate of differentiation per generation. This, of course, assumes that there are no
environmental differences between the sublines. To appreciate the practical importance
of this differentiation we may express the variance between sublines, $\sigma_b^2$, as
a proportion of the phenotypic variance within sublines, $\sigma_w^2$. We then have

$$\frac{\sigma_b^2}{\sigma_w^2} = \frac{2tV_m}{2N_eV_m + V_E}$$

If we substitute $V_m = 10^{-3}V_E$ and $N_e = 2.6$ we get

$$\sigma_b^2/\sigma_w^2 = (1.99 \times 10^{-3})t$$

Thus the variance between the means of sib-mated sublines separated, for example, 
25 generations previously will be about 5 per cent of the phenotypic variance within 
the sublines. Detectable, and even quite large, differences between sublines are 
therefore to be expected, and have in fact been found in several experiments, for 
example McLaren and Michie (1954), Grewal (1962), Festing (1973), and the 
experiment described in Example 15.1 below. For further details about mutation 
and subline divergence see Lynch and Hill (1986) and Lynch (1988). 
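
The expected divergence can be tabulated against the number of generations since separation. The sketch below is an illustration only: the function name is hypothetical, the defaults are the values used in the text for sib mating, and the calculation assumes no environmental differences between sublines.

```python
def subline_variance_ratio(t, n_e=2.6, vm_over_ve=1e-3):
    """Between-subline variance as a proportion of the phenotypic variance
    within sublines: 2 t V_m / (2 N_e V_m + V_E), with variances in units of V_E."""
    return 2.0 * t * vm_over_ve / (2.0 * n_e * vm_over_ve + 1.0)


if __name__ == "__main__":
    for t in (10, 25, 50, 100):
        ratio = subline_variance_ratio(t)
        print(f"t = {t:3d} generations: ratio = {100 * ratio:.1f} per cent")
```

For 25 generations the ratio comes to about 5 per cent, as stated above, and it rises in direct proportion to t.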

Example 15.1 This is a brief description of a study of two inbred mouse strains by 
Bailey (1959). Six measurements of various skeletal dimensions were studied in two 
sublines of the C57BL/6 strain and two sublines of the BALB/cAn strain. The strains 
had been separated into sublines after 30 and 78 generations respectively of full-sib mating, 
so differentiation by residual segregation was not a serious possibility. The average number 
of generations since the separation of the sublines was 46.5 in C57BL and 9 in BALB. 
The sublines of both strains had by then diverged a great deal in respect of some characters 
and in both strains the sublines differed significantly in four of the six characters. The
ratio $\sigma_b^2/\sigma_w^2$ for the six characters in C57BL ranged from 0 to 1.27 with a mean of 0.41,
and in BALB the ratio ranged from 0.08 to 0.52 with a mean of 0.23. By substituting
the appropriate values into the expression given above for $\sigma_b^2/\sigma_w^2$ and rearranging, we
can arrive at estimates of the mutational variance. The values of $(V_m/V_E) \times 10^3$ ranged
from 0 to 15 with a mean of 5 in C57BL and from 4 to 34 with a mean of 14 in BALB.
These estimates, however, are very imprecise because they are based on only two sublines
in each strain.
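
The rearrangement mentioned in the example can be sketched as follows. Writing x for $V_m/V_E$ and R for the observed ratio of between-subline to within-subline variance, the expression $R = 2tx/(2N_ex + 1)$ solves to $x = R/(2t - 2N_eR)$. The function name is illustrative, and applying it to the mean ratios quoted above is only an approximation to averaging the separate estimates for each character.

```python
def mutational_variance_ratio(ratio, t, n_e=2.6):
    """Estimate V_m / V_E from the observed between/within subline variance
    ratio after t generations of separation."""
    denominator = 2.0 * t - 2.0 * n_e * ratio
    if denominator <= 0.0:
        raise ValueError("ratio is too large for this t and N_e")
    return ratio / denominator


if __name__ == "__main__":
    # Mean ratios and generations of separation quoted in Example 15.1
    for strain, ratio, t in (("C57BL", 0.41, 46.5), ("BALB", 0.23, 9.0)):
        estimate = mutational_variance_ratio(ratio, t)
        print(f"{strain}: (V_m / V_E) x 10^3 = {1000 * estimate:.1f}")
```

The values obtained from the mean ratios fall close to the means of 5 and 14 reported in the example.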

Crossing 

In the first part of this chapter we saw how the genetic variance of a metric character 
is distributed between and within inbred lines. We now have to consider the complementary
problem of how the variance is distributed between and within crosses.
The variance between crosses is important in predicting what improvement can be 
expected from inbreeding and crossing; this will be explained in the next chapter. 
The variance after crossing presents a simpler problem than the variance after 
inbreeding for the following reason. The gametes produced by inbred lines are not 
different from the gametes produced by a non-inbred population, provided selection 
has not changed gene frequencies. This means that any $F_1$ individual has a genotype
that could, in principle, be found in a random-breeding population; and, conversely, 
any genotype in the original random-breeding population could be found among the 
crosses. Consequently the total genetic variance after crossing is the same as that 
in the base population. The $F_1$ individuals of the same cross can be regarded as a
‘family’ with a degree of relationship dependent on the inbreeding coefficient of 
the parental lines. The covariance of these ‘families’ is the variance between crosses. 
If the parental lines are fully inbred, all members of the same $F_1$ have identical
genotypes and the variance between crosses is equal to the total genetic variance 
in the base population. If the parental lines are not fully inbred, the covariance of 
the ‘families’ can be worked out from the coefficients of relationship derived from 
coancestries. This will now be explained by consideration of random crosses. 

Variance between crosses 

Assume that a large number of lines have been inbred without selection from the 
same base population and all to the same inbreeding coefficient. Crosses are then 
made at random between the lines. Assume further that each cross is made from 
many individuals of the parent lines, these parental individuals not being related 
to each other within their lines. This means that the genetic variance within the lines 
is fully represented in the crosses. Each cross, then, is to be regarded as a family 
with a certain degree of relationship between its members. The component of variance 
between crosses is the covariance of these related individuals. The composition of 
the covariance of relatives was given in equation [9.13] in terms of the genetic components
and the coefficients of relationship, r and u. The coefficient r concerns the
additive variance and u the dominance variance. The coefficients r and u can be 
derived from the coancestries as shown by equations [9.11] and [9.12]. The 
coancestries concerned in a cross are as follows. The relevant individuals are shown 
in Fig. 15.3, where they are given letters to correspond with those in Fig. 5.4. A 
and C are two individuals of one parental line; B and D two individuals of the other 

Fig. 15.3. Pedigree of a cross. Line 1: individuals A and C. Line 2: individuals B and D.
$F_1$ progeny: P (from A × B) and Q (from C × D).

parental line. P and Q are two $F_1$ individuals produced by crossing A with B, and
C with D, respectively. There are many other crosses producing other $F_1$'s with
pedigree relationships like P and Q. We need the covariance appropriate to the relationship
of P with Q expressed in the coancestry $f_{PQ}$. First, by equation [9.11],
$r = 2f_{PQ}$, and by equation [5.2], $f_{PQ} = \frac{1}{4}(f_{AC} + f_{AD} + f_{BC} + f_{BD})$. The two parental
lines are unrelated, so $f_{AD} = f_{BC} = 0$. The two lines have the same inbreeding
coefficient, so $f_{AC} = f_{BD} = F$, where F is the inbreeding coefficient of the next
generation of the lines, i.e., the generation contemporaneous in terms of generations
with the $F_1$. Thus $f_{PQ} = \frac{1}{2}F$, and this gives

$$r = F \qquad [15.2]$$

Next, by equation [9.12], $u = f_{AC}f_{BD} + f_{AD}f_{BC}$, which reduces to

$$u = F^2 \qquad [15.3]$$

The covariance can now be expressed in terms of the inbreeding coefficient by
substituting equations [15.2] and [15.3] into equation [9.13]. This gives the variance
between crosses as

$$\sigma_X^2 = FV_A + F^2V_D + F^2V_{AA} + F^3V_{AD} + F^4V_{DD} + \ldots \qquad [15.4]$$

In this expression $V_A$ and $V_D$ are the additive and dominance variances in the base
population; $V_{AA}$, $V_{AD}$ and $V_{DD}$ are the interaction components as explained in
Chapter 8; and F is the inbreeding coefficient, not of the parents used in the crosses,
but of the next generation of the lines. The remainder of the genetic variance appears
within crosses, i.e. $(1 - F)V_A + (1 - F^2)V_D$, etc. The variance between crosses is
the variance of the true means, which would be estimated as the component of variance
from an analysis of variance. Equation [15.4] gives the genetic content of this component.
It corresponds to what would be estimated experimentally only if there are
no environmental differences between the crosses. If the variance of the observed 
means of crosses were to be calculated, this would be increased by a fraction of 
the within-cross variance depending on the number of individuals measured in each 
cross for the reasons explained in connection with family selection in Chapter 13 
(see Table 13.3). 

The main point of interest in equation [15.4] is that the contributions of the different
components of genetic variance are differently related to the inbreeding coefficient.
The contribution of the additive variance increases linearly with F, but those
of the dominance and interaction components increase in proportion to the square 
or higher powers of F. The consequence is that if the character is one with much 
non-additive variance, the crosses become differentiated much more rapidly in the 
late stages of inbreeding, as F approaches 1, than they do in the early stages. When 
F = 0.5, each cross is genetically equivalent to a full-sib family in the base
population; when F = 1, the whole of the genetic variance appears between crosses.
The practical importance of the way in which the between-cross variance is related 
to F will be considered in the next chapter. 
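
To see how differently the components contribute as inbreeding proceeds, equation [15.4] can be evaluated over a range of F. This is a sketch with hypothetical variance components; the function name is illustrative, and interaction terms beyond those written out in [15.4] are ignored.

```python
def variance_between_crosses(F, v_a, v_d=0.0, v_aa=0.0, v_ad=0.0, v_dd=0.0):
    """Equation [15.4], truncated after the V_DD term."""
    return (F * v_a
            + F ** 2 * v_d
            + F ** 2 * v_aa
            + F ** 3 * v_ad
            + F ** 4 * v_dd)


if __name__ == "__main__":
    # Hypothetical base population with equal additive and dominance variance
    v_a, v_d = 0.5, 0.5
    for F in (0.25, 0.5, 0.75, 1.0):
        additive_part = variance_between_crosses(F, v_a)
        dominance_part = variance_between_crosses(F, 0.0, v_d=v_d)
        print(f"F = {F:4.2f}  from V_A = {additive_part:.3f}  "
              f"from V_D = {dominance_part:.3f}")
```

The additive contribution rises in equal steps, whereas the dominance contribution gains most of its value in the later stages of inbreeding.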

Combining ability 

The variance between crosses was derived above on the assumption that a large 
number of lines were crossed at random. The ‘large number’ implies that each line
was used in only one cross. If, in contrast, each line is crossed with several others, 
the variance between crosses can be partitioned in a way that has great importance 
for understanding the use of cross-breeding for improvement. We shall assume, for 
the sake of explanation, that large numbers of lines, crosses, and individuals are 
used, so that all means are estimated without error. Crossing a line to several others 
provides an additional measure of the line, i.e. the mean performance of the line 
in all its crosses. This mean performance, when expressed as a deviation from the 
mean of all crosses, is called the general combining ability of the line. It is the average 
value of all $F_1$'s having this line as one parent, the value being expressed as a deviation
from the overall mean of crosses. Any particular cross, then, has an ‘expected’
value which is the sum of the general combining abilities of its two parental lines.
The cross may, however, deviate from this expected value to a greater or lesser
extent. This deviation is called the specific combining ability of the two lines in combination.
In statistical terms, the general combining abilities are main effects and
the specific combining ability is an interaction. The true mean X of a cross between
lines P and Q can thus be expressed as

$$X - \bar{X} = GCA_P + GCA_Q + SCA_{PQ} \qquad [15.5]$$

where $\bar{X}$ is the mean of all crosses, and GCA and SCA are the general and specific
combining abilities respectively. In practice another term, E, must be added to the 
right-hand side to represent sampling error in estimating X. 
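
The decomposition in equation [15.5] can be illustrated with a small table of cross means. The sketch below makes several assumptions that are not in the text: the crosses form a complete factorial in which one set of lines is used as male parents (rows) and a different set as female parents (columns), the means are treated as free of error, and the data are invented for illustration. The function name is hypothetical.

```python
def combining_abilities(cross_means):
    """Split a rows-by-columns table of cross means into the overall mean,
    general combining abilities of row and column lines, and specific
    combining abilities (the interaction deviations)."""
    n_rows, n_cols = len(cross_means), len(cross_means[0])
    grand = sum(sum(row) for row in cross_means) / (n_rows * n_cols)
    gca_rows = [sum(row) / n_cols - grand for row in cross_means]
    gca_cols = [sum(cross_means[i][j] for i in range(n_rows)) / n_rows - grand
                for j in range(n_cols)]
    sca = [[cross_means[i][j] - grand - gca_rows[i] - gca_cols[j]
            for j in range(n_cols)]
           for i in range(n_rows)]
    return grand, gca_rows, gca_cols, sca


if __name__ == "__main__":
    # Hypothetical means: 3 male lines (rows) x 3 female lines (columns)
    means = [[10.0, 12.0, 14.0],
             [11.0, 15.0, 13.0],
             [9.0, 12.0, 12.0]]
    grand, gca_m, gca_f, sca = combining_abilities(means)
    print("overall mean:", grand)
    print("GCA of male lines:", gca_m)
    print("GCA of female lines:", gca_f)
    for row in sca:
        print("SCA:", row)
```

Each cross mean is recovered exactly as the overall mean plus the two general combining abilities plus its specific combining ability, and the specific combining abilities sum to zero along every row and column.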

The terms on the right-hand side of equation [15.5] are uncorrelated with each 
other, so the total between-cross variance (excluding error variance) is made up as 
follows: 

$$\sigma_X^2 = \sigma_{GCA(M)}^2 + \sigma_{GCA(F)}^2 + \sigma_{SCA}^2 \qquad [15.6]$$

where (M) and (F) refer to the general combining abilities of lines used as male 
and as female parents respectively. If the lines are not distinguished by sex or in 
any other way, then equation [15.6] becomes 

$$\sigma_X^2 = 2\sigma_{GCA}^2 + \sigma_{SCA}^2 \qquad [15.7]$$

The two components into which the total between-cross variance can be partitioned 
are the variance of general combining ability and the variance of specific combining 
ability. These are observational components of variance in the sense explained in 
Chapter 9, and are estimated from an analysis of variance. Their importance lies 
in the fact that the causal components of genetic variance contribute to them differently,
as we shall now see.

A set of crosses with one line as common male parent can be regarded as a family 
analogous to a paternal half-sib family. The covariance of these families is the variance 
of general combining abilities of male lines, $\sigma_{GCA(M)}^2$. This covariance is found
from the coefficients of relationship in the same way as before. Figure 15.3 will
serve to illustrate the relevant pedigree. Individuals A and C are two members of
the common male line, but B and D are now members of two different lines to which
A and C are crossed, producing the two $F_1$'s, P and Q. The coancestries are now
all zero except $f_{AC} = F$. This gives $r = \frac{1}{2}F$ and u = 0. Substitution into equation
[9.13] as before gives

$$\sigma_{GCA(M)}^2 = \tfrac{1}{2}FV_A + \tfrac{1}{4}F^2V_{AA} + \ldots$$

The same argument applies to lines used as female parents and therefore, provided
there are no maternal or sex-linked effects, $\sigma_{GCA(F)}^2 = \sigma_{GCA(M)}^2$. The variance of
specific combining ability is what is left of the total variance between crosses as
given in equation [15.4]. We thus arrive at the following composition of the components
of variance of crosses:

$$\begin{aligned}
\text{General combining ability of male parents:}\quad & \sigma_{GCA(M)}^2 = \tfrac{1}{2}FV_A + \tfrac{1}{4}F^2V_{AA} + \ldots \\
\text{General combining ability of female parents:}\quad & \sigma_{GCA(F)}^2 = \tfrac{1}{2}FV_A + \tfrac{1}{4}F^2V_{AA} + \ldots \\
\text{Specific combining ability:}\quad & \sigma_{SCA}^2 = F^2V_D + \tfrac{1}{2}F^2V_{AA} + F^3V_{AD} + F^4V_{DD} + \ldots
\end{aligned} \qquad [15.8]$$

From this it can be seen that differences of general combining ability are due to 
the additive variance and A × A interactions in the base population; and differences
of specific combining ability are attributable to the non-additive genetic variance. 
Consequently the variance of general combining ability increases linearly with F 
(apart from the interaction component), while the variance of specific combining 
ability increases with higher powers of F. It is therefore the specific, and not the 
general, combining ability that is expected to increase in variance more rapidly as 
the inbreeding reaches high levels. 
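
The contrast between the two kinds of combining ability can be made concrete with a short calculation. In this sketch the variance of general combining ability for one sex is taken as $\frac{1}{2}FV_A + \frac{1}{4}F^2V_{AA}$, and the variance of specific combining ability is obtained, as the text describes, as what is left of the total in equation [15.4] after both general components are removed. The function name and the variance components are hypothetical, and higher-order interaction terms are ignored.

```python
def combining_ability_variances(F, v_a, v_d=0.0, v_aa=0.0, v_ad=0.0, v_dd=0.0):
    """Return (GCA variance for one sex, SCA variance, total between crosses)."""
    total = (F * v_a + F ** 2 * v_d + F ** 2 * v_aa
             + F ** 3 * v_ad + F ** 4 * v_dd)
    gca_one_sex = 0.5 * F * v_a + 0.25 * F ** 2 * v_aa
    sca = total - 2.0 * gca_one_sex  # remainder after male and female GCA
    return gca_one_sex, sca, total


if __name__ == "__main__":
    # Hypothetical base population: V_A = 0.6, V_D = 0.4, no epistasis
    for F in (0.25, 0.5, 0.75, 1.0):
        gca, sca, total = combining_ability_variances(F, 0.6, v_d=0.4)
        print(f"F = {F:4.2f}  2 x GCA = {2 * gca:.3f}  SCA = {sca:.3f}  "
              f"total = {total:.3f}")
```

With no epistasis the general component is proportional to F while the specific component is proportional to the square of F, so the specific component gains ground as inbreeding approaches completion.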

The components of genetic variance in equation [15.8] are those of a random-breeding
population with all gene frequencies equal to those in the lines crossed and
with coupling and repulsion linkages in equilibrium. This random-breeding population
can be regarded as being the base population, real or hypothetical, from which
the lines were derived without selection. Or, alternatively, it can be regarded as 
a synthetic population made by random mating among the crosses and then bred 
by random mating for long enough to reach linkage equilibrium. Which of these 
viewpoints is to be adopted affects the details of the analysis of variance. In the first 
case the lines are regarded as a sample of the population and are therefore random 
factors: in the second case the lines are the whole population and are fixed factors 
(see Griffing, 1956b, for details). 

Estimation of combining abilities. A method of estimating general combining 
abilities that is convenient for use with plants is known as the polycross method. 
A number of plants from all the lines to be tested are grown together and allowed 
to pollinate naturally, self-pollination being prevented by the natural mechanism for 
cross-pollination, or by the arrangement of the plants in the plot. The seed from 
the plants of one line are therefore a mixture of random crosses with other lines, 
and their performance when grown tests the general combining ability of that line. 
The general combining abilities measured are those of lines used as female parents.
If the variances of general combining ability are assumed to be the same for male
and female parents, the variances of general and specific combining ability can be 
estimated and interpreted as in equation [15.8]. 

The general combining ability of a line can be estimated by crossing it with 
individuals from the base population instead of with other inbred lines. This method 
is known as top-crossing. It is equivalent to crossing with a random set of lines inbred
from the base population without selection because, as noted earlier, the gametes 
from inbred lines are not different in genetic content from those of the base population. 

A commonly used experimental design for crossing inbred lines is the diallel cross, 
in which each line is crossed with every other line. The estimation of the general 
combining abilities of the lines is explained and illustrated in Example 15.2 below. 
The analysis of a diallel cross for the purpose of estimating variances is complicated 
because it depends on whether reciprocal crosses are included, and on the assumption
made about the population to which the genetic components of variance refer.
The theory and analytical procedures are explained by Griffing (1956a, b) and are 
evaluated by Pooni, Jinks and Singh (1984) and A. J. Wright (1985). 

Example 15.2 The calculation of general and specific combining abilities will be
illustrated by data from Sprague and Tatum (1942) on crosses between inbred lines of
maize. Ten lines were used and each was crossed with each of the other nine, but reciprocal
crosses were not made. There were therefore $\tfrac{1}{2}n(n - 1) = 45$ crosses, n being the
number of lines. The yields, in bushels per acre, are given in table (i); these are the
mean yields of each cross. (100 bushels per acre of maize = 6.725 tonnes per hectare.)
The column headed T gives the total yield of each line in all nine crosses, obtained by
summing down the appropriate column and along the row as indicated by the arrows.
Note that each cross contributes to two totals, so $\Sigma T = 2\Sigma X$, where X is the yield of a cross.


Table (i)

      B     C     D     E     F     G     H     I     J      T      GCA
A    86    84    98    98    92    92    97    81    88    816     3.75
B    └→    91   105   102    86    92    79    80    90    811    3.125
C          └→    87    80    65    84    93    77    83    744    -5.25
D                └→    97   100   101    97    91    80    856     8.75
E                      └→    97    83    93    78    83    811    3.125
F                            └→    80    93    76    70    759   -3.375
G                                  └→    90    74    72    768    -2.25
H                                        └→    91    96    829    5.375
I                                              └→    78    726    -7.50
J                                                    └→    740    -5.75

Sums                                         2ΣX = ΣT =   7860    0.000
Overall mean                                 X̄ = 87.333




If there were a very large number of lines, the general combining ability of a line
would be calculated simply as the deviation of its mean, T/(n - 1), from the overall
mean, $\bar{X}$. With a small number of lines, however, this is not valid because each of the
other lines contributes a fraction, 1/(n - 1), of its general combining ability to the mean
of the line in question. Thus, for example, the mean $\bar{A}$ of line A in all its crosses is

$$\bar{A} = \bar{X} + G_A + \frac{1}{n - 1}(G_B + G_C + \dots + G_N)$$

where the G’s are the general combining abilities of the lines A to N as indicated by
the subscripts. Now, $\Sigma G = 0$, so $(G_B + G_C + \dots + G_N) = -G_A$, and so

$$\bar{A} = \bar{X} + \left(\frac{n - 2}{n - 1}\right) G_A$$

from which

$$G_A = \left(\frac{n - 1}{n - 2}\right)(\bar{A} - \bar{X})$$

It is more convenient to work with the totals than with the means. Substituting
$\bar{A} = T_A/(n - 1)$ and $\bar{X} = \Sigma T/n(n - 1)$ leads to

$$G_A = \frac{T_A}{n - 2} - \frac{\Sigma T}{n(n - 2)}$$


The general combining abilities of the lines in the table are entered in the column headed 
GCA. Formulae appropriate to other designs of diallel cross are given by Simmonds 
(1979, p. 112). 
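
The calculation of the GCA column can be checked with a few lines of code. The following is a minimal sketch: the yields are those of table (i), while the variable names (`rows`, `yields`, `totals`, `gca`) are assumptions introduced only for this illustration. It applies the formula based on totals, G = T/(n - 2) - ΣT/n(n - 2).

```python
lines = "ABCDEFGHIJ"

# Upper triangle of table (i): yield of each cross in bushels per acre.
# rows["A"] holds A x B, A x C, ..., A x J; rows["B"] holds B x C, ..., B x J; etc.
rows = {
    "A": [86, 84, 98, 98, 92, 92, 97, 81, 88],
    "B": [91, 105, 102, 86, 92, 79, 80, 90],
    "C": [87, 80, 65, 84, 93, 77, 83],
    "D": [97, 100, 101, 97, 91, 80],
    "E": [97, 83, 93, 78, 83],
    "F": [80, 93, 76, 70],
    "G": [90, 74, 72],
    "H": [91, 96],
    "I": [78],
}

# Index every cross by its pair of parental lines.
yields = {}
for i, p in enumerate(lines[:-1]):
    for q, x in zip(lines[i + 1:], rows[p]):
        yields[(p, q)] = x

n = len(lines)

# Total of each line over its n - 1 crosses (column T of the table).
totals = {p: sum(x for pair, x in yields.items() if p in pair) for p in lines}
sum_T = sum(totals.values())

# General combining ability from the totals.
gca = {p: totals[p] / (n - 2) - sum_T / (n * (n - 2)) for p in lines}

for p in lines:
    print(f"{p}: T = {totals[p]:4d}   GCA = {gca[p]:7.3f}")
```

The general combining abilities obtained in this way sum to zero, as the derivation requires.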

The ‘expected’ value of each cross can now be calculated, ‘expected’ meaning the
value that would be predicted from the two general combining abilities, in the absence
of any knowledge about the specific combining ability. To take the best cross, BD, as
an example, $E(X_{BD}) = 3.125 + 8.75 + 87.333 = 99.21$. The difference between the
observed and expected values estimates the specific combining ability of the two lines
in combination: $SCA_{BD} = 105 - 99.21 = +5.79$. The value of the specific combining
ability so obtained is subject to the sampling error in estimating $X_{BD}$. Figure 15.4 shows
a plot of the observed yields against the expected yields. If there were no deviations
from expectation, the points would lie on the diagonal line with a slope of 1. The vertical
distance of any point from the diagonal is the specific combining ability together
with the sampling error of the yield of the cross. (If the lines were not highly inbred,
there would also be error in estimating the specific combining abilities, this error being
due to the sampling of genotypes from the lines.)

For the purpose of illustration, the variances of general and specific combining ability
will be calculated on the supposition that the lines were a random sample from a population
of lines, though in fact they were not randomly selected. The analysis of variance
for estimating the components is given in table (ii). The sums of squares for GCA and
SCA were calculated from the data in table (i); the mean square for error is the value
stated in the paper. The variance of the true means of crosses, i.e., after deducting the
variance due to error, is made up of 54.5 per cent due to general combining ability and
45.5 per cent due to specific combining ability. The proportion attributed to specific
combining ability is large because these lines were a selected group, and the variation
in general combining ability among them was consequently less than would be expected
in a random sample.

Table (ii)

Source    d.f.      SS        MS     Expectation of MS
GCA          9    2,179    242.11    $\sigma^2_E + \sigma^2_{SCA} + 8\sigma^2_{GCA}$
SCA         35    1,617     46.20    $\sigma^2_E + \sigma^2_{SCA}$
Error                        5.36    $\sigma^2_E$

$2\sigma^2_{GCA}$ = 48.98 = 54.5%
$\sigma^2_{SCA}$  = 40.84 = 45.5%
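
The entries of table (ii) can be reproduced from the yields in table (i). The following is a minimal sketch under the same supposition of randomly sampled lines. The variable names are assumptions made for the illustration, and the error mean square of 5.36 is taken as given, since it cannot be computed from the cross means.

```python
lines = "ABCDEFGHIJ"

# Upper triangle of table (i): yield of each cross in bushels per acre.
rows = {
    "A": [86, 84, 98, 98, 92, 92, 97, 81, 88],
    "B": [91, 105, 102, 86, 92, 79, 80, 90],
    "C": [87, 80, 65, 84, 93, 77, 83],
    "D": [97, 100, 101, 97, 91, 80],
    "E": [97, 83, 93, 78, 83],
    "F": [80, 93, 76, 70],
    "G": [90, 74, 72],
    "H": [91, 96],
    "I": [78],
}
yields = {
    (p, q): x
    for i, p in enumerate(lines[:-1])
    for q, x in zip(lines[i + 1:], rows[p])
}

n = len(lines)
n_crosses = len(yields)                      # n(n - 1)/2 = 45
grand_mean = sum(yields.values()) / n_crosses

totals = {p: sum(x for pair, x in yields.items() if p in pair) for p in lines}
sum_T = sum(totals.values())
gca = {p: totals[p] / (n - 2) - sum_T / (n * (n - 2)) for p in lines}

# Deviation of each cross from its expected value: SCA together with error.
sca = {(p, q): x - (grand_mean + gca[p] + gca[q]) for (p, q), x in yields.items()}

# Sums of squares and mean squares.
ss_gca = (n - 2) * sum(g * g for g in gca.values())
ss_sca = sum(s * s for s in sca.values())
ms_gca = ss_gca / (n - 1)                    # 9 d.f.
ms_sca = ss_sca / (n_crosses - n)            # 35 d.f.
ms_error = 5.36                              # value quoted in the text, not computed here

# Components of variance from the expectations of the mean squares.
var_sca = ms_sca - ms_error
var_gca = (ms_gca - ms_sca) / (n - 2)
share_gca = 2 * var_gca / (2 * var_gca + var_sca)

print(f"SCA of B x D = {sca[('B', 'D')]:+.2f}")
print(f"SS(GCA) = {ss_gca:.0f}, SS(SCA) = {ss_sca:.0f}")
print(f"2 var(GCA) = {2 * var_gca:.2f}, var(SCA) = {var_sca:.2f}")
print(f"Share due to GCA = {100 * share_gca:.1f}%")
```

The factor 2 applied to the GCA component reflects the fact that each cross has two parents, each contributing its own general combining ability.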



Fig. 15.4. Observed and expected yield (bushels per acre) in crosses between ten lines of maize. 
The expected yield of each cross is the sum of the general combining abilities of the two parental 
lines. Deviations from the regression line are due to specific combining ability and error in 
estimating the mean yield of the cross. The two dashed lines show the positions of deviations 
from the regression of ± 2 standard errors of cross-means. (Data from Sprague and Tatum, 
1942.) 

Problems 

15.1 The genetic variance of abdominal bristle number in Drosophila is 
predominantly additive (see Table 8.2). Suppose that a large population is subdivided 
into replicate lines each bred from 20 parents, and that these lines are continued 
until the calculated inbreeding coefficient is 50 per cent. If the phenotypic variance 
in the base population was 4.0 and the heritability 52 per cent, what would be the
components of the phenotypic variance between lines and within lines? What would
be the heritability (1) within lines and (2) overall, disregarding the subdivision into 
lines? 


[Solution 75] 

15.2 Suppose that the base population of Drosophila in Problem 15.1 is inbred 
for three generations by full-sib mating. What would be the expected response to 
selection carried out in the following manner? There are many lines, each consisting 
of a single full-sib family. Four flies, two of each sex, are taken at random from 
each line; the best 5 per cent of the lines are selected on the basis of the mean of 
the 4 flies sampled from them; the 4 flies from each of the selected lines are then 
mated with their sibs and the progeny are measured. 


[Solution 85] 

15.3 The body size of Drosophila, measured as thorax length, has little non-additive 
genetic variance (Table 8.2). Suppose that the heritability of thorax length was found 
to be 0.34 in a stock that was maintained by random mating among large numbers 
but had recently been started from a single pair of flies. What would be the estimate 
of the heritability in the population from which the original pair was taken? 

[Solution 95] 

15.4 Minor differences in the skeleton are common in mice. Twenty-seven 
characters of which variants occur were studied in sublines of the C57BL inbred 
strain. Two particular sublines were found to differ in respect of 5 of the 27 characters, 
and each difference was attributed to one mutational step. The strain had been inbred 
by full-sib mating for 40 or more generations before separation of the sublines. After 
separation, one of the sublines had a further 21 generations, and the other had 29 
generations, of full-sib mating at the time of the study. If the differences did arise 
from mutation, how would these results be interpreted in terms of the number of 
loci at which mutations can affect the characters and their mutation rates? 

Data from Deol, M.S. et al. (1975) J. Morph., 100, 345-76.


[Solution 105] 

15.5 The following estimates of the parameters of individual plant yield were made 
in an open-pollinated variety of maize. 

Mean = 308 g; $V_P$ = 5,798; $V_A$ = 864; $V_D$ = 188

If this population were inbred without any selection and the lines were crossed 
(1) when the inbreeding coefficient was 50 per cent, and (2) when the inbreeding 
coefficient was virtually 100 per cent, what would be the phenotypic components 
of variance between crosses and within crosses? Assume that $V_I$ = 0, and that
environmental differences between crosses were eliminated by the experimental 
design. 

If each cross mean were estimated from measurements of 20 individual plants, 
what would be the variance of the observed means of crosses?

Data from Gardner, C.O. (1977) pp. 475-89 in Pollak, E. et al. (eds), Proc. Int.
Conf. Quantitative Genetics. Iowa State Univ., Ames, Iowa, USA.

[Solution 115] 


15.6 Consider again the crosses in Problem 15.5. If 50 crosses were made at each 
stage, what would you predict the highest observed mean yield among the 50 crosses 
to be? If this best cross were then repeated, and a new set of plants from it were 
grown and measured, what would you predict the mean yield to be, assuming that 
there was no environmental difference between the first and second determinations 
of its yield? 

[Solution 125] 

15.7 A diallel cross, without reciprocals, was made between five varieties of the 
bean Phaseolus aureus, which is normally self-pollinating. The mean yields in grams 
per plant of the crosses are given below. The varieties are designated A to E and 
their yields are also given (on the diagonal), although these are not needed for this 
problem. Calculate (1) the general combining ability of each variety, and (2) the 
deviation from expectation of each cross, i.e. the specific combining ability + error.
Plot a scatter diagram like Fig. 15.4.



         A       B       C       D       E
A       9.7    14.1    22.8    16.9    31.8
B        -      3.3    16.5     6.2    12.4
C        -       -      9.0     8.3     9.2
D        -       -       -      6.8    13.1
E        -       -       -       -     12.5


Data from Singh, K.B. & Jain, R.P. (1971) Theor. Appl. Genet., 41, 279-81.

[Solution 135] 




INBREEDING AND CROSSBREEDING: 
III. Applications 


The crossing of inbred lines to produce hybrids plays a major role in the improvement
of some plants, most notably maize. Crossing is also widely used in animal
breeding, though highly inbred lines of farm animals are not available because of
the severe loss of fertility from inbreeding depression. Animal crosses are therefore
made between mildly inbred lines or between different breeds. The principles underlying
the use of inbreeding and crossing for improvement will be explained in this
chapter. We shall be concerned mainly with outbreeding plants, but animals and 
naturally self-fertilizing plants will be considered briefly in separate sections. 
Technical details will not be given; for these the reader should consult a textbook 
of plant breeding, e.g., Simmonds (1979). Two simplifications will be made. First, 
it will be assumed that the only criterion of merit in plant breeding is yield, though 
in practice other characters have to be taken into consideration as well as yield. 
Second, the complications arising from genotype × environment interactions will
not be discussed. Obviously an improved hybrid must perform well in a range of 
different environments associated with different years and different localities. It will 
be assumed that this requirement is included in the assessment of merit. Crossing 
highly inbred lines is used also as a method of genetic analysis both with plants and 
laboratory animals. These analyses of crosses and the later generations derived from 
them are fully described by Mather and Jinks (1977, 1982) and will not be dealt 
with here. 

Applied to outbreeding organisms, the purpose of crossing is, of course, to produce
superior crossbred, or F1, individuals. Consider first a set of lines all derived
from the same random-breeding base population. The crosses must then be superior 
not only to the inbred lines but to the outbred population from which the lines were 
derived. Something more than heterosis is therefore sought, since heterosis is the 
superiority over the inbred lines. It was shown in Chapter 14 that when lines are 
inbred without selection the mean of all their crosses is expected to be equal to the 
mean of the outbred population from which they were derived. Therefore inbreeding 
and crossing alone cannot produce any improvement; there must be selection at some 
stage if any improvement is to be made. 

The lines that are crossed are, however, usually derived from different base populations.
Some of the superiority of the crosses then comes from heterosis. If the two
base populations differ in gene frequencies, a cross between them will show heterosis,
as explained in Chapter 14. In the same way, the mean of crosses between sets of
inbred lines derived without selection from the two base populations will be superior 
to the mean of the two base populations. Some improvement would therefore be 
achieved even without any selection. With animals, the lines crossed already exist 
and there is no existing base population from which they were derived. All the gain 
from crossing is therefore heterosis, and the only selection is in the choice of which 
lines or breeds to cross. There is nothing to add here to the account of heterosis 
given in Chapter 14. We are concerned therefore with the selection by which most 
of the improvement is achieved in plants. It will be assumed for simplicity that all 
the lines to be crossed are derived from the same base population. 

Some improvement can be expected from the effects of natural selection. It 
eliminates lethal and severely deleterious genes during the inbreeding and, in so far 
as these genes affect the desired character, an improvement of the crossbred mean 
over that of the base population is to be expected. But this improvement will not 
be very great, because the deleterious genes eliminated will have been at low frequencies
in the base population (and the more harmful, the lower the frequency),
so that their effect on the population mean will be small. It has been calculated,
on the basis of assumptions about the number of loci concerned and their mutation 
rates, that an improvement of 5 per cent in fitness is the most that could be expected 
from the elimination of deleterious recessive genes (Crow, 1948, 1952). Most of 
the improvement, therefore, must come from artificial selection applied to the 
economically desirable characters. The methods of applying this artificial selection 
are the main topic for consideration. There is, however, another question that must 
be considered at the same time. Developing inbred lines and evaluating their crosses 
is a long and costly process. What are the advantages of this method over straightforward
selection applied to the original outbreeding population? This question can
be partly answered now. 

The crossing of inbred lines produces no genotypes that could not occur in the 
base population. But whereas the best genotypes occur only in certain individuals 
in the base population, they are replicated in every individual of certain crosses. 
It is in this replication of a desirable genotype that the chief merit of the method 
lies. When a good cross has been found, its genotype can be produced in any required 
number of individuals and, by repeating the cross, in successive generations. Furthermore,
the genetic identity of the F1 individuals gives them a phenotypic uniformity
which is an economic benefit, particularly for mechanical harvesting. For
example, uniformity of ripening time means that all individuals are ready for 
harvesting at the same time. Though the genotype of a cross might be found in an 
individual of the base population, the replication of the genotype in the cross allows 
the genotypic value to be measured with little error; whereas the genotypic value 
of an individual in the base population is only crudely measured by its phenotypic 
value. Further, it is the genotypic value that is measured in the cross and can be 
reproduced indefinitely, as long as the inbred lines are maintained; whereas only 
the breeding value can be reproduced by selection of individuals in a non-inbred 
population. Therefore the condition under which inbreeding and crossing are likely 
to be a better means of improvement than selection without inbreeding is when much 
of the genetic variance of the character is non-additive.

Selection for combining ability

Ultimately the breeder is looking for the pair of inbred lines among all those available 
that will give the best cross. In other words, the selection is ultimately to be applied 
to the crosses. The amount of improvement that can be made by selection among 
a number of crosses depends on the amount of variation between the crosses, and 
on the intensity of selection as described in Chapter 11. To get a high intensity of 
selection requires a large number of crosses, and to get the maximum amount of 
variation between the crosses requires the lines to be inbred to a high level, as was 
shown in the previous chapter. Time and space can, however, be saved by applying 
some preliminary selection to the lines. This can be done in two ways. First, the 
lines to be used for the cross finally selected must themselves be reasonably productive
as inbreds. Lines are therefore selected first for their own performance. A line’s
inbred performance is correlated with its performance in crosses to some extent
depending on how much of the variance is due to additive genes. The correlation,
however, is rather small, about 0.1 for yield in maize (Gama and Hallauer, 1977),
so the improvement of the crosses expected from selection of the lines for their
performance as inbreds is not very great. Second, the value of a cross is made up
of two parts, as explained in the previous chapter: the general and specific combining
abilities of the two parent lines. The general combining abilities of the lines can
be tested in a variety of ways, outlined in the previous chapter, without the necessity
of making all possible crosses. Furthermore, a useful guide to the general
combining ability can be obtained from lines that are not yet fully inbred.

The improvement made by the preliminary selection of the lines for their general 
combining ability comes from the additive variance in the base population. Any further
improvement, making use of the non-additive genetic variance, must come from
selection for specific combining ability. Here there is no way of selecting the lines 
by preliminary tests; the crosses must be made, from which to select the best. Since 
the variance of specific combining abilities is proportional to the square or higher 
powers of the inbreeding coefficient (equation [15.8]), the lines must have reached 
a high level of inbreeding before much can be gained from selection for specific 
combining ability. 

Relative importance of general and specific combining abilities. How much of the 
improvement is expected to come from general combining ability and how much 
from specific combining ability? If the intensity of selection applied to each is the 
same, the relative amount of improvement due to each will be proportional to their 
variances. If the lines are fully inbred, the variance of general combining ability 
is equal to the additive variance in the base population and the variance of specific 
combining ability is equal to the non-additive variance (equation [15.8]). So if the 
variance components in the base population are known, the relative amount of 
improvement from the two combining abilities can be predicted. In an open-pollinated 
variety of maize the additive variance of yield as a proportion of the total was 0.149, 
and the non-additive variance was 0.032 (Gardner, 1977). The proportion of additive 
variance in the total genetic variance was thus 0.149/0.181 = 0.823. Therefore if 
inbreeding and crossing were applied to this population, and selection was applied 
to the crosses, about 80 per cent of the improvement would be expected to come 
from general combining ability and only about 20 per cent from specific combining
ability. If these variance components are characteristic of maize populations, it seems
that by far the largest part of the improvement in hybrid maize comes from general
combining ability and ultimately from additive variance in the base population. This
raises the question of how far yield could be improved by selection in a random-breeding
population without inbreeding. Several experiments have shown that selection
in open-pollinated varieties is effective. In one, the population responded for
14 generations with a total improvement of 40 per cent in yield (Gardner, 1977).
An improved open-pollinated variety does not have the uniformity which is an
important feature of inbred-crosses, but prior selection in the base populations is
an effective way of increasing the general combining abilities of inbred lines subsequently
made from them.
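
The division of the expected improvement between the two combining abilities can be set out as a short calculation. This is a minimal sketch that assumes fully inbred lines (F = 1), equal intensity of selection on the two combining abilities, and no epistatic variance. The function name is hypothetical, and the proportions are those quoted above from Gardner (1977).

```python
def improvement_shares(additive, non_additive):
    """Split the expected improvement between GCA and SCA.

    With fully inbred lines the variance of general combining ability equals
    the additive variance and that of specific combining ability equals the
    non-additive variance, so the shares are proportional to the two variances.
    """
    genetic = additive + non_additive
    return additive / genetic, non_additive / genetic


# Proportions of the phenotypic variance of yield quoted in the text.
share_gca, share_sca = improvement_shares(0.149, 0.032)
print(f"From general combining ability:  {100 * share_gca:.1f}%")
print(f"From specific combining ability: {100 * share_sca:.1f}%")
```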

Synthetic populations. When inbred lines have been made and selected for their 
combining abilities, no further improvement can be made to the crosses of those 
lines. To achieve further improvement a new set of inbred lines must be made from 
an improved base population. The new base population may have been improved 
by selection without inbreeding, or it can be constructed from the selected inbred 
lines. Crossing a number of selected inbred lines and allowing the F1 and later
generations to cross-pollinate at random creates a new synthetic population. The
improved general combining ability of the lines, being based on additive variance,
is retained in the synthetic population. Segregation in the F2 and later generations
then allows a new set of inbred lines to be made with gene combinations different
from those of the lines used to construct the population. In this way, further
improvement of combining abilities can be achieved. The hybrid maize currently
in use is the product of two or more such cycles of inbreeding and crossing. In addition
to the improvement of the hybrids, the yield of the lines as inbreds is also
improved, which is important economically because the hybrid seed sold for 
commercial growing must be produced by an inbred parent. 

Three-way and four-way crosses; backcrosses 

The practical difficulties associated with the low productivity of inbred lines can
be overcome by the use of 3-way and 4-way crosses, though with some loss of performance
and of uniformity in the crosses. These crosses were widely used for the
production of hybrid maize until the improved inbreds mentioned above were
available. In a 3-way cross the F1 of two lines is used as female or seed parent,
in which high productivity is required, and the F1 is then crossed with a third line.
In a 4-way cross, or double-cross, two F1's of different pairs of lines are crossed.
If lines derived from different base populations are available, then in order to make
use of the inter-population heterosis the final cross is made between lines of different
origin. For example, if lines A and B are from one origin and lines P and
Q from another origin, a 4-way cross is made as A × B and P × Q, followed by
AB × PQ. The performance of 3-way and 4-way crosses can be reliably predicted
from the performances of single crosses of the constituent lines, provided there is
no epistatic interaction. Consider for example the 3-way cross (A × B) × P. Let
these letters represent alleles at a single locus carried by the corresponding lines.
The F1 of A × B then has the genotype AB, which when crossed with line P gives
two genotypes, AP and BP in equal proportions. These are the genotypes of crosses
of lines A × P and B × P respectively. Therefore the mean performance of these
two crosses predicts the performance of the 3-way cross. In the same way, the performance
of the 4-way cross (A × B) × (P × Q) is predicted by the mean of A
× P, A × Q, B × P, and B × Q. If more than one locus is considered, however,
segregation in the F1 parents produces genotypes in the final cross that could not
appear in any single cross of the lines used. Therefore if there is epistatic interaction,
the single crosses will not predict the final cross accurately.
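
The prediction rule just described amounts to averaging the relevant single crosses. The following is a minimal sketch, valid only in the absence of epistatic interaction. The function names are hypothetical, and the single-cross yields are borrowed from table (i) of Example 15.2 purely as illustrative numbers, not as a claim about how those particular lines would perform in 3-way or 4-way crosses.

```python
from statistics import mean

# Single-cross means (bushels per acre), keyed by the unordered pair of lines.
# Values borrowed from table (i) of Example 15.2 for illustration only.
single = {
    frozenset("AB"): 86, frozenset("AC"): 84, frozenset("AD"): 98,
    frozenset("BC"): 91, frozenset("BD"): 105, frozenset("CD"): 87,
}


def cross(p, q):
    """Observed mean of the single cross p x q."""
    return single[frozenset((p, q))]


def predict_three_way(a, b, p):
    """Predicted mean of (a x b) x p: the mean of a x p and b x p."""
    return mean([cross(a, p), cross(b, p)])


def predict_four_way(a, b, p, q):
    """Predicted mean of (a x b) x (p x q): the mean of the four
    single crosses between the two pairs of lines."""
    return mean([cross(a, p), cross(a, q), cross(b, p), cross(b, q)])


print(predict_three_way("A", "B", "D"))      # mean of A x D and B x D
print(predict_four_way("A", "B", "C", "D"))  # mean of A x C, A x D, B x C, B x D
```

Note that the single crosses A × B and P × Q, which supply the F1 parents, do not enter the prediction.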

The population produced by any particular 3- or 4-way cross is a mixture of 
genotypes, all of which could in principle have been produced by single crosses, 
but 3- and 4-way crosses are expected to differ from single crosses in the following 
ways: (1) If the lines crossed have been selected, and if any of the consequent 
superiority of their single crosses is due to epistatic interactions, some of this 
superiority will be lost in the 3- and 4-way crosses. (2) There is genetic variation 
within crosses and a consequent loss of phenotypic uniformity. (3) The variance 
between crosses is reduced and the best 3- or 4-way cross is consequently not as 
good as the best single cross. For experimental comparisons of 3- and 4-way crosses 
with single crosses, illustrating these consequences in maize, see Weatherspoon 
(1970), and Otsuka, Eberhart, and Russell (1972). 

Another way of avoiding the low productivity of inbred lines is by a backcross.
Here only two lines are involved, the F1 being mated to one of the lines used in
the first cross, i.e., (A × B) × B. The genotypes in the progeny of the backcross
are AB and BB in equal proportions. Therefore, in the absence of epistatic interactions,
the mean of the backcross is equal to the mean of the F1 and the line used
in the second cross. Consequently there is less heterosis in backcrosses than in 3-way
or 4-way crosses.

Crosses in animals. Crossing is widely used in animal production, most of the
animals produced for meat being the progeny of either a 3-way cross or a backcross.
As was noted earlier, the lines crossed are not deliberately inbred. They have,
however, been previously selected for desirable characters and have become mildly
inbred with a consequent reduction of some desirable characters, particularly fertility.
The purpose of the crossing is partly to make use of heterosis to improve fertility
and partly to combine the different characteristics for which the lines were
previously selected. For meat production a desirable quality in the final product is
rapid growth and what is desired of the final cross is to produce large numbers of
rapidly growing individuals. This requires good fertility in the mother coupled with
good growth rate in the progeny. Accordingly, the first cross (A × B) is made to
produce F1's with good fertility, which comes from heterosis. These F1 (AB) individuals
are used as mothers and crossed to a third line (C) with good growth rate
to produce the (A × B) × C progeny. Or if no suitable third line is available, the
F1 is backcrossed to one of the lines used in the first cross. The improved growth
of the final progeny comes partly from heterosis and partly from the additive effects
of the sire line. Their growth rate may not always be as good as the best of the
lines, but the increased numbers produced by the fertile AB mothers make the crossing
economically advantageous. The gains from different types of cross are reviewed
by Dickerson (1969) and by Sheridan (1981).

Example 16.1 A 3-way cross of sheep breeds will serve to illustrate the gain from
combining the heterosis of fertility with the superior growth of the sire-line. The data
come from Sidwell, Everson, and Terrill (1962, 1964). The table gives the pure-bred
and 3-way cross performances of the three characters: (a) fertility as the number of lambs
weaned per ewe mated; (b) growth rate as the weight per lamb at weaning; and (c) the
economically important character, total weight of lamb weaned per ewe mated, which
is the product (a) × (b). The weaning weight of the cross was not as good as the sire-line
itself, but the larger number of lambs weaned by the F1 females made the 3-way
cross superior to the best pure breed in respect of the total weight of the weaned lambs.



                                          Production per ewe mated

                                       (a)             (b)             (c)
                                   No. of lambs    Weaning wt.      Total wt.
                                      weaned      (kg) per lamb    (kg) weaned

Pure breeds
   A = Shropshire                      0.80           23.0            18.4
   B = Southdown                       0.79           19.1            15.1
   C = Hampshire                       1.00           29.2            29.2
Mid-parent (¼A + ¼B + ½C)              0.90           25.1            22.6
3-way cross (A × B) × C                1.25           27.5            34.4
Heterosis, % above mid-parent           39             10              52
Superiority over best breed (%)        +25             -6             +18
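
The last two rows of the table follow from the rows above them. The following is a minimal sketch of that arithmetic. The variable names are assumptions, and the mid-parent values are entered as tabulated (rounded) rather than recomputed from the pure-breed figures, which is an assumption about how the published percentages were obtained.

```python
traits = ("lambs weaned", "weaning wt (kg)", "total wt (kg)")

# Pure-breed performance for characters (a), (b) and (c), from the table.
pure = {
    "Shropshire": (0.80, 23.0, 18.4),
    "Southdown": (0.79, 19.1, 15.1),
    "Hampshire": (1.00, 29.2, 29.2),
}
mid_parent = (0.90, 25.1, 22.6)   # tabulated values, weights 1/4, 1/4, 1/2
three_way = (1.25, 27.5, 34.4)    # (Shropshire x Southdown) x Hampshire

# Heterosis: percentage by which the cross exceeds the mid-parent value.
heterosis = [round(100 * (c / m - 1)) for c, m in zip(three_way, mid_parent)]

# Superiority: percentage by which the cross exceeds the best pure breed.
best = [max(values[i] for values in pure.values()) for i in range(len(traits))]
superiority = [round(100 * (c / b - 1)) for c, b in zip(three_way, best)]

for name, h, s in zip(traits, heterosis, superiority):
    print(f"{name:16s} heterosis = {h:3d}%   superiority = {s:+d}%")
```

The negative superiority for weaning weight and the positive superiority for total weight together show why the cross is advantageous despite growing less well than the sire breed.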


Reciprocal recurrent selection (RRS) 

The specific combining ability of a cross cannot be measured without making and 
testing that particular cross. Therefore to achieve a reasonably high intensity of selection
for specific combining ability, a large number of crosses must be made and
tested. Is no short-cut possible? Could the superior combining ability not be, as it 
were, built into the lines by selection? From the causes of heterosis explained in 
Chapter 14 it is clear that what is wanted is a pair of lines that differ widely in the 
gene frequencies at all loci that affect the character and that show dominance. It 
should therefore be possible to build up these differences of gene frequency in two 
lines by selection. Instead of the differences of gene frequency being produced by 
the random process of inbreeding, they would be produced by the directed process 
of selection, which would be both more effective and more economical. Furthermore,
both general and specific combining ability would be selected for simultaneously.
Selection for combining ability in this way is known as reciprocal recurrent
selection. Its theoretical basis has been examined by Comstock, Robinson, and Harvey 
(1949) and Dickerson (1952). In outline, the procedure is as follows. 

The start is made from two populations, preferably two already known to give 
some heterosis when crossed. These two populations, whose combining ability is 
to be improved, will be referred to as lines A and B. Crosses are made reciprocally, 
a number of A ♂♂ being mated to B ♀♀, and a number of B ♂♂ to A ♀♀. The
crossbred progeny are then measured for the character to be improved and the parents 
are judged from the performance of their progeny. The best parents are selected 
and the rest discarded, together with all the crossbred progeny, which are used only 
to test the combining ability of the parents. The selected individuals must then be
remated, to members of their own line, to produce the next generation of parents 
to be tested. These are crossed again as before and the cycle repeated. Deliberate 
inbreeding is avoided because random changes of gene frequencies are not desired. 

An essential prerequisite is that there should be some difference of gene frequency 
between the two lines at the beginning, or else selection for combining ability will 
be unable to produce a differentiation of the lines. Any locus at which the gene frequencies
are the same in the two lines will be in equilibrium, though an unstable
equilibrium. Any shift in one direction or the other will give the selection something 
to act on and the difference will be increased. The initial difference between the 
lines may be obtained by starting from two different breeds or varieties, choosing 
two that already cross well; or by deliberate inbreeding, up to perhaps 25 per cent, 
and relying on random differentiation of gene frequencies. 
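
The dependence of heterosis on the difference of gene frequency can be sketched numerically. The Python example below assumes the usual single-locus model with genotypic values +a, d and -a, random mating within each line, and no epistasis. Under these assumptions the heterosis contributed by one locus works out as d times the squared difference of gene frequency, so it is zero when the two lines have the same frequency. The values of a and d and the frequencies are hypothetical.

```python
def line_mean(p, a, d):
    """Mean of a random-mating line; p is the frequency of the increasing allele."""
    q = 1.0 - p
    return a * (p - q) + 2.0 * d * p * q


def cross_mean(p1, p2, a, d):
    """Mean of the cross between two lines with gene frequencies p1 and p2."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    heterozygotes = p1 * q2 + p2 * q1
    return a * (p1 * p2 - q1 * q2) + d * heterozygotes


def heterosis(p1, p2, a, d):
    """Deviation of the cross mean from the mid-parent value."""
    mid_parent = 0.5 * (line_mean(p1, a, d) + line_mean(p2, a, d))
    return cross_mean(p1, p2, a, d) - mid_parent


A_EFFECT, D_EFFECT = 1.0, 0.5  # hypothetical gene effects (partial dominance)
for p1, p2 in [(0.5, 0.5), (0.6, 0.4), (0.8, 0.2), (1.0, 0.0)]:
    h = heterosis(p1, p2, A_EFFECT, D_EFFECT)
    print(f"p1={p1:.1f} p2={p2:.1f} heterosis={h:.3f}")
```

The heterosis grows with the square of the frequency difference, which is why selection that widens the difference between the lines raises the performance of the cross.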

Evidence about the practical value of reciprocal recurrent selection is conflicting. 
It is used by some commercial breeders of poultry for egg production (see Krosigk 
et al., 1973), and has given promising results with maize (Eberhart, 1977). On the 
other hand, direct comparisons with other methods of selection in poultry and in 
laboratory animals have not been encouraging (see Calhoon and Bohren, 1974; 
McNew and Bell, 1976). 

Overdominance 

The question of whether inbreeding and crossing is a better method of improvement 
than selection without inbreeding hinges on overdominance as a property of the genes 
concerned. Overdominance for fitness was discussed in Chapter 2 as a mechanism 
for the maintenance of polymorphism, and the different ways in which true overdominance
and pseudo-overdominance due to linkage can arise were explained. Here
we are concerned with overdominance for the character to be improved and in practice
it matters little how the overdominance arises. Both methods of improvement
involve selection, as we have already seen, so the essential distinction is in the crossing.
Crossing two lines in which different alleles are fixed gives an F₁ in which
all individuals are heterozygotes; and this is the only way of producing a group of 
individuals that are all heterozygotes. In a non-inbred population no more than 50 
per cent of the individuals can be heterozygotes for a particular pair of alleles. Consequently,
if heterozygotes of a particular pair of alleles are superior in merit to
homozygotes, inbreeding and crossing will be a better means of improvement than 
selection without inbreeding. Furthermore, it is only when there is overdominance 
with respect to the desired character, or combination of characters, that inbreeding 
and crossing can achieve what selection without inbreeding cannot. Under any other 
conditions of dominance the best genotype is one of the homozygotes and all 
individuals can, in theory, be made homozygous by selection, without the disadvantages
attendant on inbreeding and much more simply than by methods dependent
on crossing. It was stated earlier in this chapter that the potentialities of inbreeding 
and crossing are greatest when there is much non-additive genetic variance and little
additive. Now we see that this is only part of the truth: in theory, and leaving
all practical considerations aside, inbreeding and crossing can surpass selection 
without inbreeding only when there is at least some degree of overdominance of 
the genes concerned. 
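
The limit of 50 per cent heterozygotes can be checked numerically. The sketch below assumes random mating, so that heterozygotes for a pair of alleles occur at the Hardy-Weinberg frequency 2pq; the function names are illustrative.

```python
def heterozygote_frequency(p):
    """Frequency of heterozygotes, 2pq, in a random-mating population."""
    return 2.0 * p * (1.0 - p)


def f1_heterozygote_frequency(p1, p2):
    """Frequency of heterozygotes in a cross of lines with gene frequencies p1, p2."""
    return p1 * (1.0 - p2) + p2 * (1.0 - p1)


# Scan gene frequencies from 0 to 1 in steps of 0.01.
grid = [i / 100 for i in range(101)]
max_het = max(heterozygote_frequency(p) for p in grid)
best_p = max(grid, key=heterozygote_frequency)

print("maximum within a population:", max_het, "at p =", best_p)
print("F1 of two lines fixed for different alleles:", f1_heterozygote_frequency(1.0, 0.0))
```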

A variety of experimental work on both plants and animals, some of which has
been mentioned in earlier chapters, suggests that overdominance for the characters 
studied is not an important property of the genes. This is true even of yield in maize 
for which inbreeding and crossing has been so successful (Eberhart, 1977). There 
seems, therefore, to be little theoretical justification for believing that inbreeding 
and crossing is a better way of increasing the mean of the desired character. Its 
advantages are mainly in the uniformity rather than in the improved mean. 

Naturally self-fertilizing plants 

Self-fertilizing plants usually show a small amount of heterosis when crosses are 
made. The reason for this is presumably that a few deleterious mutant genes have 
been fixed by the inbreeding despite the selection against them. The heterosis, 
however, is much less than is shown by outbreeding plants. For example, the heterosis 
in crosses between varieties of wheat and of barley (inbreeders) amounts to about 
10 per cent (Geiger, 1988), whereas in crosses between inbred lines of maize (an 
outbreeder) the heterosis is commonly about 150 per cent. To make use of the 
heterosis for commercial growing is, however, not easy because making the crosses 
is usually difficult technically, and to produce hybrid seed requires crossing on a 
large scale in every generation. There are several ways of overcoming the technical 
difficulties of crossing but these have been successful with only a few crops (see 
Simmonds, 1979, p. 231). The purpose of the crossing is therefore usually to generate 
segregation. After a cross has been made, the F₁ and subsequent generations are
allowed to self-fertilize, producing a new set of inbred lines which become differentiated
by recombination and random drift. The aim is to find one or more of these
recombinant lines that is better than either of the parental lines. The amount of 
improvement to be expected from any particular cross can be predicted from the 
additive genetic variance in the F₂ and the intensity of selection that will be applied
to the recombinant lines (see Jinks and Pooni, 1976; Pooni and Jinks, 1979). 


Problems 

16.1 If you were to make a three-way cross and a four-way cross of the varieties 
in Problem 15.7, which varieties would you choose, and how would you make the 
crosses, in order to get the highest predicted yields? What would the predicted yields 
be? 


[Solution 96] 


16.2 A ‘rotational cross’ with two breeds or lines, A and B, is made as follows, 
where X always refers to the crossbred generation. 

(1) A × B
(2) X₁ × A
(3) X₂ × B
(4) X₃ × A


Calculate the expected performance of the crossbred progeny in each generation up 
to (4), in terms of the purebred and single-cross performances, assuming no epistatic
interaction and no maternal effects. 


[Solution 106] 

16.3 A rotational cross with 3 breeds, A, B, C, is made as follows, where X again 
represents the crossbred progeny. 

(1) A × B
(2) X₁ × C
(3) X₂ × A
(4) X₃ × B
(5) X₄ × C

Calculate the expected performance of each generation in terms of the purebred and 
single-cross performances, assuming the absence of epistasis and maternal effects. 

[Solution 116] 

16.4 Suppose that all the single crosses of the lines used in rotational crossing show 
the same amount of heterosis. What proportion of this single-cross heterosis will 
be attained in successive generations of the rotational cross? Work this out for the 
four generations of the two-line rotational cross in Problem 16.2 and for the five 
generations of the three-line rotational cross in Problem 16.3. 


[Solution 126] 

16.5 When rotational crossbreeding is applied to animals the females used for crossing
are always crossbred and the males always purebred. The system then has the
useful feature that the pure breeds themselves need produce no more females than
are required for replacements. The data below are the mean weights of individual
pigs at 154 days (W) and the mean number of pigs per litter at 154 days (N), in
three breeds, Chester White (C), Duroc (D) and Yorkshire (Y), and their single
crosses. From the solution of Problem 16.3 calculate the expected mean total weight
per litter in each of the 5 generations of rotational crossing, starting with
(Y ♀ × C ♂) ♀ × D ♂. Make the simplifying assumptions that the number in the litter
depends only on the mother’s genotype and the weight depends only on the
individual’s genotype, and that epistasis is negligible.


C D Y CD CY DY 


W (kg) 78 88 84 92 90 96 

N 6.6 6.3 7.9 7.4 8.2 7.3 


Data from Schneider, J.F. (1978) Ph.D. Thesis, Iowa State Univ., Ames, Iowa, USA. 

[Solution 136] 



17 SCALE


The choice of a suitable scale for the measurement of a metric character has been 
mentioned several times in the foregoing chapters. The explanation of what is involved 
in the choice of a scale and a discussion of the criteria of suitability have, however, 
been deferred till this point because these are matters that cannot be properly appreciated
until the nature of the deductions to be made from the data is understood.
In other words, the choice of a scale has to be made in relation to the object for 
which the data are to be used. The data from any experimental or practical study 
are obtained in the form most convenient for the measurement of the character. That 
is to say, the phenotypic values are recorded in grams, centimetres, days, numbers, 
or whatever unit of measurement is most convenient. The point at issue is whether 
these raw data should be transformed to another scale before they are subjected to 
analysis or interpretation. A transformation of scale means the conversion of the 
original units to logarithms, reciprocals, or some other function, according to what 
is most appropriate for the purpose for which the data are to be used. 

It is tempting to suppose that each character has its ‘natural’ scale, the scale on 
which the biological process expressed in the character works. Thus, growth is a 
geometrical rather than an arithmetical process, and a geometric scale would appear 
to be the most ‘natural’. For example, an increase of 1 g in a mouse weighing 20 g 
has not the same biological significance as an increase of 1 g in a mouse weighing 
2 g; but an increase of 10 per cent has approximately the same significance in both. 
For this reason a transformation to logarithms would seem appropriate for 
measurements of weight. This, however, is largely a subjective judgement, and some 
objective criterion for the choice of a scale is needed. Different criteria, however, 
are often inconsistent in the scale they indicate and, moreover, the same criterion 
applied to the same character may indicate different scales in different populations. 
Therefore the idea that every character must have its ‘natural’ and correct scale is 
largely illusory. 

There are, broadly speaking, three main reasons for making a scale transformation:
(1) to make the distribution normal; (2) to make the variance independent of
the mean; and (3) to remove or reduce non-additive interactions. The criterion for 
the choice of a scale is in each case the empirical one of achieving the particular 
objective. When a scale transformation is called for but is not made, certain 
phenomena arise, called scale effects, which disappear when the appropriate transformation
is made. The objectives noted above might equally well be stated as being
the removal of these scale effects. We shall discuss in particular the logarithmic
transformation which converts an arithmetic to a geometric scale. This is probably 
the commonest and most useful transformation. Other transformations are described 
by Wright (1968, Ch. 10). The general principles, outlined by reference to the log 
transformation, apply equally to other transformations. 

Distribution and variance 

Consider first the distribution of phenotypic values. Figure 17.1 shows three distributions
plotted as if from the original data on an arithmetic scale. They would all three
be symmetrical and normal if the data were first transformed to logarithms. There
are two points of importance to notice. First, the degree of departure from normality
depends on the amount of variation in relation to the mean. This may be seen
from a comparison of the two upper graphs, (a) and (b), which are not very noticeably 
asymmetrical, with the lower graph, (c), which is. The relationship between the 
amount of variation and the mean, which determines the degree of departure from 
normality, is best expressed as the coefficient of variation, i.e., the ratio of standard 
deviation to mean, often multiplied by 100 to bring it to a percentage. The coefficient
of variation of the two upper graphs is 20 per cent, while that of the lower
graph is 50 per cent. Thus, a transformation to logarithms does not make an
appreciable difference to the shape of the distribution unless the coefficient of variation
is fairly high, that is, above about 20 per cent. Consequently, statistical
procedures which do not rely on a strictly normal distribution, such as the analysis
of variance, can be carried out on the untransformed data when the coefficient of
variation is not above about 20 per cent. Transformations to other scales are also
less necessary when the coefficient of variation is low than when it is high.

Fig. 17.1. Distributions that are symmetrical and normal on a logarithmic scale shown plotted
on an arithmetic scale. Explanation in text.
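
The link between the coefficient of variation and the degree of asymmetry can be made concrete. The sketch below uses a standard property of the lognormal distribution that is not taken from the text: if a character is exactly normal on the logarithmic scale and has coefficient of variation C on the arithmetic scale, its skewness on the arithmetic scale is C(C² + 3). The function name is illustrative.

```python
def lognormal_skewness(cv):
    """Skewness on the arithmetic scale of a character that is normal in logs.

    cv is the coefficient of variation as a proportion (0.2 for 20 per cent).
    Assumes an exactly lognormal distribution.
    """
    return cv * (cv ** 2 + 3.0)


for cv in (0.05, 0.10, 0.20, 0.50):
    print(f"CV = {100 * cv:4.0f}%  skewness = {lognormal_skewness(cv):.3f}")
```

On this assumption the skewness at a coefficient of variation of 50 per cent is between two and three times that at 20 per cent, in line with the contrast between graph (c) and graphs (a) and (b).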

The second point to notice in Fig. 17.1 is that the variance, when computed in 
arithmetic units, increases when the mean increases. This may be seen in graphs 
(a) and (b). These both have the same variance in logarithmic units, but different
means. The mean (or, strictly speaking, the mode) of (b) is double that of (a)
and the standard deviation in arithmetic units is correspondingly doubled. Though
the distributions are not very noticeably skewed and a transformation does not seem 
to be very strongly indicated, yet in consequence of the difference of mean the 
variances differ very greatly. Here, then, is one of the commonest scale effects, 
namely a change of variance following a change of the population mean. The two 
graphs (a) and (b) in Fig. 17.1 might well represent two populations which have
diverged by some generations of two-way selection, if the character were something 
like body size measured in units of weight. Such characters are commonly found 
to increase in variance when the mean increases, and to decrease in variance when 
the mean decreases. Figure 17.2 shows an example from an experiment with mice, 
the character being weight at 60 days. Note that none of the three distributions considered
separately seems to be sufficiently asymmetrical or non-normal to need a
scale transformation on this criterion; but to make the variance independent of the 
mean, a transformation is very obviously required. 



Fig. 17.2. Distributions of body weight of male mice at 60 days. Centre: base population before
selection. Left and right: small and large strains after 21 generations of two-way selection.
(Based on MacArthur, 1949.)

                            Small    Unselected    Large
Mean                        11.97    23.16         39.85
Standard deviation           1.71     2.56          5.10
Coeff. of variation, %      14.3     11.1          12.8

Phenomena such as the change of variance discussed above are called scale effects
if they disappear when the measurements are appropriately transformed: in other 
words, if their cause can be attributed to the scale of measurement. But they are 
none the less real, though labelled as a scale effect or removed by transformation. 
The large mice, for example, are really more variable than the small when their 
weights are measured in grams. What is gained by recognizing this as a scale effect 
is that there is no need to look deeper into the genetic properties of the character 
for an explanation. 

A convenient test for the appropriateness of a logarithmic transformation is provided
by the proportionality of standard deviation and mean, which we noted in connection
with graphs (a) and (b) in Fig. 17.1. If two distributions have the same
variance on a logarithmic scale then the coefficients of variation in arithmetic units 
will be the same. Thus, constancy of the coefficient of variation indicates constancy 
of variance on a logarithmic scale. And, if variances are to be compared, we may 
simply compare the coefficients of variation instead of expressing the variances in 
logarithmic units. The standard deviations and coefficients of variation of the distributions
shown in Fig. 17.2 are given in the legend to the figure. The coefficients of
variation, though not identical, are much more alike than the standard deviations, 
and this shows that the changes of variance that have resulted from the selection 
can be attributed, in large part at least, to the scale of measurement. 
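
The comparison can be reproduced from the means and standard deviations in the legend to Fig. 17.2. The following Python sketch recomputes the coefficients of variation and contrasts the spread of the standard deviations with the spread of the coefficients of variation; the variable names are illustrative.

```python
# Means and standard deviations from the legend to Fig. 17.2.
strains = {
    "Small": (11.97, 1.71),
    "Unselected": (23.16, 2.56),
    "Large": (39.85, 5.10),
}

# Coefficient of variation, as a percentage, for each strain.
cv_percent = {name: 100.0 * sd / mean for name, (mean, sd) in strains.items()}

sds = [sd for _, sd in strains.values()]
sd_ratio = max(sds) / min(sds)
cv_ratio = max(cv_percent.values()) / min(cv_percent.values())

for name, cv in cv_percent.items():
    print(f"{name:<11} CV = {cv:.1f}%")
print(f"largest / smallest standard deviation  = {sd_ratio:.2f}")
print(f"largest / smallest coeff. of variation = {cv_ratio:.2f}")
```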

When a logarithmic transformation is required, it is not always necessary to convert
each individual measurement. Conversion of the mean and of the variance can
conveniently be made by the following formulae (Wright, 1968, p. 229): 

\[
\overline{\log x} = \log \bar{x} - \tfrac{1}{2}\log(1 + C^2) \qquad [17.1]
\]

\[
\sigma^2_{(\log x)} = 0.4343\,\log(1 + C^2) \qquad [17.2]
\]

The first converts the mean of arithmetic values to the mean of logarithmic values, 
and the second converts the variance as computed from the arithmetic values to the 
variance as it would be computed from logarithmic values. In these formulae C is 
the coefficient of variation in the form σ/x̄ computed from arithmetic values, and
the logarithms are to the base 10. The formulae are accurate only if the distribution 
really is normal on the logarithmic scale. 
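
A minimal sketch of formulae [17.1] and [17.2] in Python follows. The mean of 20 units and the coefficient of variation of 50 per cent are hypothetical values chosen for illustration, and the function names are not from the text. As the text notes, the results are accurate only if the distribution really is normal on the logarithmic scale.

```python
import math


def log_mean(arith_mean, cv):
    """Mean of log10 values from the arithmetic mean and CV (formula [17.1])."""
    return math.log10(arith_mean) - 0.5 * math.log10(1.0 + cv ** 2)


def log_variance(cv):
    """Variance of log10 values from the arithmetic CV (formula [17.2])."""
    return 0.4343 * math.log10(1.0 + cv ** 2)


mean, cv = 20.0, 0.5  # hypothetical arithmetic mean and coefficient of variation
m_log = log_mean(mean, cv)
v_log = log_variance(cv)

print(f"mean of logs       = {m_log:.4f}")
print(f"variance of logs   = {v_log:.4f}")
print(f"std. dev. of logs  = {math.sqrt(v_log):.4f}")
print(f"antilog of mean    = {10 ** m_log:.2f}")
```

The antilog of the logarithmic mean is lower than the arithmetic mean, as expected when the distribution is skewed to the right on the arithmetic scale.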

When conclusions about variances depend critically on eliminating any scale effect, 
it may be necessary to find the empirical relationship between variance, or standard 
deviation, and mean. This can be done if several populations with different means 
are available, and if there are no reasons other than the scale effect for thinking 
that their variances would differ. Then the regression equation relating the standard 
deviation to the mean gives the expected standard deviation in another population 
with a particular mean. If the regression is linear the regression equation is
σ = a + bx̄, where σ is the expected standard deviation in a population with a mean of
x̄, a is the intercept and b is the regression coefficient. A scale on which the variance
would be independent of the mean would be X = log(x + a/b), where x is the original
measurement and X its transformed value (Wright, 1968, p. 232).
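
The procedure can be sketched as follows. For illustration only, the regression is fitted to the three means and standard deviations given in the legend to Fig. 17.2. Three populations are far too few for a reliable regression, and the helper names `fit_line` and `transform` are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept a and slope b of y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def transform(x, intercept, slope):
    """Scale X = log10(x + a/b); requires slope != 0 and x + a/b > 0."""
    return math.log10(x + intercept / slope)


means = [11.97, 23.16, 39.85]   # Small, Unselected, Large (Fig. 17.2)
sds = [1.71, 2.56, 5.10]

a, b = fit_line(means, sds)
print(f"a = {a:.3f}, b = {b:.3f}, a/b = {a / b:.2f}")
print("transformed means:", [round(transform(m, a, b), 3) for m in means])
```

With these data the intercept is small relative to the standard deviations, so the fitted scale differs little from a simple logarithmic one.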

Fig. 17.3. Response to two-way selection for resistance to dental caries in rats. Resistance is
measured in days and plotted on an arithmetic scale in (a), and on a logarithmic scale in (b). The
arithmetic means were converted to logarithmic means by formula [17.1]. The coefficient of
variation was high (about 50 per cent) and was approximately constant. The reason why the
upward selection has not covered so many generations as the downward is simply that the
increased resistance lengthened the generation interval. (Data from Hunt, Hoppert, and Erwin,
1944.)

Let us return to the consequences of selection and pursue them a little further.
If the variance changes with the change of mean as a result of selection, so also
will the selection differential and the response. The response per generation of a
character such as we have been considering would therefore be expected to increase
with the progress of selection in the upward direction, and to decrease correspondingly
in the downward direction. The response to two-way selection would then be
asymmetrical. An example of an asymmetrical response which can most probably 
be attributed to a scale effect in this way is shown in Fig. 17.3. Plotted in arithmetic 
units, as in (a), the response is much greater in the upward than in the downward 
direction. A transformation to logarithms, shown in (b), renders the response much
more nearly symmetrical. This does not do away with the fact that the character 
as measured increased much more than it decreased under selection. But it accounts 
for the asymmetry without the need for more elaborate hypotheses. A convenient 
way of eliminating scale effects from the graphical presentation of a response to 
selection is to plot the response in the form of the realized heritability, as explained 
in Chapter 11 and illustrated in Fig. 11.5. The realized heritability, which is the 
ratio of response to selection differential, is very little influenced by scale effects 
(Falconer, 1954). 
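
How a symmetrical response on the logarithmic scale appears asymmetrical in arithmetic units can be shown with a small numerical sketch. The starting mean, the response per generation and the number of generations below are hypothetical and are not taken from Fig. 17.3.

```python
START = 100.0       # hypothetical starting mean, in arithmetic units
LOG_STEP = 0.05     # hypothetical response per generation, in log10 units
GENERATIONS = 6

# Equal steps up and down on the logarithmic scale.
up = [START * 10 ** (LOG_STEP * g) for g in range(GENERATIONS + 1)]
down = [START * 10 ** (-LOG_STEP * g) for g in range(GENERATIONS + 1)]

gain = up[-1] - START     # total upward response in arithmetic units
loss = START - down[-1]   # total downward response in arithmetic units

for g in range(GENERATIONS + 1):
    print(f"generation {g}: up = {up[g]:7.2f}   down = {down[g]:6.2f}")
print(f"upward response / downward response = {gain / loss:.2f}")
```

The two lines diverge by exactly the same amount in logarithms, yet in arithmetic units the upward response is the larger of the two.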

Interactions 

We turn now to what is perhaps a more fundamental effect of a scale transformation 
— its effect on the apparent nature of the genetic variance. To understand this we 
must go back to a single locus and consider the effect, or mode of action, of the 
genes. Imagine a locus with two alleles whose mode of action is geometric, the
genotypic value of A₂A₂ being 50 per cent greater than A₁A₂, and that of A₁A₂
being also 50 per cent greater than A₁A₁. Thus on the logarithmic scale there is
no dominance, the heterozygote being exactly mid-way between the two homozygotes.
Now suppose the genotypic values are measured in arithmetic units, such as grams,
and that A₁A₁ has a value of 10 units. Then A₁A₂ will be 15 units and A₂A₂ 22.5
units. On the arithmetic scale, therefore, A₁ is partially dominant to A₂, the
heterozygote no longer falling mid-way between the homozygotes. Thus the degree 
of dominance is influenced by the scale of measurement, and so also is the proportionate
amount of dominance variance. This effect of a scale transformation, however,
is normally rather small. A gene that causes a 50 per cent difference between the 
genotypic values, such as we have considered, would be a major gene, easily 
recognizable individually. But even so, the degree of dominance on the arithmetic 
scale is not very great. Minor genes with effects of perhaps 1 per cent or 10 per 
cent would be scarcely influenced in their dominance. 
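
The size of this effect can be worked out for genes of different magnitude. In the sketch below each substitution of the increasing allele multiplies the genotypic value by a constant ratio, and the degree of dominance on the arithmetic scale is expressed as d/a, the deviation of the heterozygote from the mid-homozygote value divided by half the difference between the homozygotes. The base value of 10 units is the one used in the text; the function name is illustrative.

```python
def arithmetic_dominance(ratio, base=10.0):
    """Genotypic values and d/a on the arithmetic scale for a geometric gene.

    Each copy of the increasing allele multiplies the value by `ratio`,
    so there is no dominance on the logarithmic scale.
    """
    low, het, high = base, base * ratio, base * ratio ** 2
    midpoint = 0.5 * (low + high)
    a = 0.5 * (high - low)      # half the difference between homozygotes
    d = het - midpoint          # deviation of heterozygote from the midpoint
    return low, het, high, d / a


for ratio in (1.5, 1.1, 1.01):
    low, het, high, degree = arithmetic_dominance(ratio)
    print(f"ratio {ratio}: values {low:.2f}, {het:.2f}, {high:.2f}; d/a = {degree:.4f}")
```

A negative d/a means that the heterozygote lies nearer the lower homozygote. For the 50 per cent gene the departure from the midpoint is one fifth of a; for the genes of 10 and 1 per cent it is much smaller.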

In the same way that the dominance is affected by the scale, so also is the epistatic 
interaction between different loci. Loci with geometric effects would combine without 
interaction if the genotypic values were measured in logarithmic units. But when 
measured in arithmetic units there would be interaction deviations due to epistasis. 
Thus the amount of interaction variance is also influenced by the scale of measurement.
The following example illustrates the dependence of interaction on scale.

Example 17.1 The pygmy gene in mice is a major gene affecting body size, homozygotes 
being much reduced in size. The effect of this gene was studied in different genetic 
backgrounds (King, 1955). The gene was transferred from the strain selected for small 
size where it arose, to a strain selected for large size, by repeated backcrosses. The mean 
difference between pygmy homozygotes and normals (i.e., heterozygotes and normal 
homozygotes together) was measured in the two strains and during the transference, the 
comparisons being made between pygmies and normals in the same litters. The results 
are shown in Fig. 17.4. The difference between pygmies and normals increases with 
the weight of the normals. In the background of the small strain the pygmies were about 
7 g smaller than normals, but in the background of the large strain they were about 14 g 
smaller. Thus the pygmy gene shows epistatic interaction with the other genes that affect 
body size. But if the effect of the gene is expressed as a proportion, it is constant and 
independent of the other genes present. Pygmies are about half the weight of their normal
litter-mates, no matter what the actual weights are. Thus if the comparisons are made
in logarithmic units there is no epistatic interaction. 
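
The arithmetic of Example 17.1 can be sketched as follows. The normal weights of about 14 g and 28 g are not quoted in the example; they are the approximate values implied by pygmies being half the weight of normals and 7 g or 14 g lighter, and are used here only for illustration.

```python
import math

PYGMY_RATIO = 0.5  # pygmies are about half the weight of normal litter-mates

# Approximate weights of normals (g) implied by the example, not quoted in it.
normal_weight = {"small-strain background": 14.0, "large-strain background": 28.0}

arith_diff = {}
log_diff = {}
for background, normal in normal_weight.items():
    pygmy = PYGMY_RATIO * normal
    arith_diff[background] = normal - pygmy
    log_diff[background] = math.log10(normal) - math.log10(pygmy)

for background in normal_weight:
    print(f"{background}: difference = {arith_diff[background]:.1f} g, "
          f"log difference = {log_diff[background]:.3f}")
```

The effect of the gene doubles in grams between the two backgrounds but is the same in logarithmic units, which is what the absence of epistatic interaction on the logarithmic scale means.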

In general, therefore, a scale transformation may remove or reduce the variance 
attributable to epistatic interaction, and this variance might then be labelled as a scale 
effect. A transformation which removes or reduces interaction variance may be useful 
if conclusions are to be drawn from an analysis that depends for its validity on the
absence of interaction. Interactions between genotype and environment may also
arise from a scale effect, and a transformation may be useful for removing or reducing
them. Interactions, whether epistatic or genotype × environment, however, cannot
always be removed or even reduced by a transformation of scale. For example,
no meaningful transformation can remove an interaction that causes a reversal of 
order, such as was illustrated in Fig. 8.2. 

Fig. 17.4. Effect of the pygmy gene in mice of different body weights. Difference of 6-week
weight between pygmies and their normal litter-mates plotted against the weight of the normals
in the same litter. The straight line is not a fitted regression, but shows the relationship: weight
of pygmy = 0.5 × weight of normal. See Example 17.1. (Data from King, 1955.)

There are two ways by which a suitable scale for removing interactions can be
found. The first is by comparing the effects of a specific factor and finding a scale
that makes the effect additive. This was illustrated in Example 17.1 above. The effects
of a single gene were compared in different genetic backgrounds, and it was found
that on a logarithmic scale the gene added the same amount on all genetic
backgrounds. The specific factor whose effects are to be made additive can be
environmental rather than genetic. The second test of a suitable scale for removing
epistatic interaction can be applied when two populations with different means are
available and can be crossed. It was shown in Chapters 14 and 16 that in the absence 
of epistatic interaction the means of the F₂ and backcross generations were expected
to be as follows: 


\[
\begin{aligned}
\bar{F}_2 &= \tfrac{1}{2}(\bar{F}_1 + \bar{P}) \\
\bar{B}_1 &= \tfrac{1}{2}(\bar{F}_1 + \bar{P}_1) \\
\bar{B}_2 &= \tfrac{1}{2}(\bar{F}_1 + \bar{P}_2)
\end{aligned}
\qquad [17.3]
\]

where P̄ is the mean of the two parental populations and all the other symbols are
the means of the corresponding generations. A scale is chosen which brings the
observed means closest to their expectations. For details, see Mather and Jinks (1982,
p. 71).
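
A minimal sketch of this test follows. It computes the deviations of the F₂ and backcross means from the expectations in [17.3], first on the arithmetic scale and then after transformation to logarithms. The generation means are hypothetical, constructed so that the gene effects combine exactly by multiplication; real data would also need standard errors before any conclusion was drawn. The function name is illustrative.

```python
import math


def scaling_deviations(means, transform=lambda v: v):
    """Observed minus expected means for F2, B1 and B2 under equations [17.3]."""
    t = {generation: transform(value) for generation, value in means.items()}
    mid_parent = 0.5 * (t["P1"] + t["P2"])
    return {
        "F2": t["F2"] - 0.5 * (t["F1"] + mid_parent),
        "B1": t["B1"] - 0.5 * (t["F1"] + t["P1"]),
        "B2": t["B2"] - 0.5 * (t["F1"] + t["P2"]),
    }


# Hypothetical generation means, exactly additive on the logarithmic scale.
hypothetical_means = {
    "P1": 10.0,
    "P2": 40.0,
    "F1": 20.0,
    "F2": 20.0,
    "B1": math.sqrt(200.0),
    "B2": math.sqrt(800.0),
}

arithmetic = scaling_deviations(hypothetical_means)
logarithmic = scaling_deviations(hypothetical_means, math.log10)

print("arithmetic scale: ", {k: round(v, 3) for k, v in arithmetic.items()})
print("logarithmic scale:", {k: round(v, 3) for k, v in logarithmic.items()})
```

With these constructed values the deviations vanish on the logarithmic scale but not on the arithmetic scale, so the logarithmic scale would be the one chosen.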


Conclusions 

In this chapter we have outlined some of the scale effects most commonly met with, 
and have indicated the circumstances under which a transformation of scale may 
be helpful to the interpretation of results and the drawing of conclusions. Transformations
of scale, however, should not be made without good reason. The first purpose
of experimental observations is the description of the genetic properties of the
population, and a scale transformation obscures rather than illuminates the description.
If epistasis, for example, is found, this is an essential part of the description,
and it is better labelled as epistasis than as a scale effect. The transformation of scale 
is essentially a statistical device to be employed for the purpose of simplifying the 
analysis of the data, or to make possible the drawing of valid conclusions from the 
analysis. It is sometimes helpful also in the interpretation of results. If epistasis, 
for example, were found to disappear on transformation to a logarithmic scale we 
could conclude that the effects of different loci combined by multiplication rather 
than by addition. Or, if there were good reasons for attributing a difference of variance 
to a scale effect we should not need to invoke more complicated genetic explanations.
The choice of scale, however, raises troublesome problems in connection with
the interpretation of results. Logical justification of a scale transformation can only 
come from some criterion other than the property about which the conclusions are 
to be drawn. If there is no independent criterion the argument becomes circular, 
and the distinction between a scale effect and some other interpretation becomes 
meaningless. There is also a more fundamental difficulty: the scale appropriate for 
one population may not be appropriate for another, and the scale appropriate to the 
genetic and environmental components of the variation may be different. This difficulty
is strikingly illustrated by an analysis of the character ‘weight per locule’
in a number of crosses between varieties of tomato (Powers, 1950). By the same 
criterion — normality of the distribution — this character was found to require an 
arithmetic scale in some crosses and a geometric scale in others; and, moreover, 
in the F₂ generations of some crosses the genetic variation required one scale while
the environmental variation required another. 


Problems 

17.1 Figure 17.2 shows data where a transformation to logarithms is indicated if 
equality of the variances of the three lines is desired. What would be the standard 
deviations of log-weights? The data show a large amount of asymmetry in the 
responses to selection. Would this asymmetry be removed by transformation to logs? 

[Solution 10] 

17.2 The data below are the 6-week weights (g) of mice from selected strains of 
different body weights and crosses of these strains. There were three ‘size groups’, 
large, control and small. In each size group there were six replicate lines. (Figure 
12.1 refers to these lines.) Crosses were made between lines of the same size group 
and between lines of different size groups. The object was to find out if the heterosis 
would be greater in crosses between lines of different size than in crosses within 
size groups. The first row of the table gives the mean weights of the replicates in 
each size group and the rest of the table gives the mean weights of the crosses, 
reciprocals averaged. How would the conclusions about the heterosis be altered by 
transformation of the means to logarithms? 

                 Large     Control    Small

Line means       28.74     20.99      14.91

Large            30.68     26.00      21.85
Control            -       21.91      18.48
Small              -         -        14.84


Data from Kumar, C.K.B. (1980) Ph.D. Thesis, University of Edinburgh. 

[Solution 20] 


17.3 The table shows the mean number of sternopleural bristles in Drosophila males
with different combinations of X chromosome and autosomes. There were two X
chromosomes, one from a high bristle-number line and the other from a low
bristle-number line. These two X chromosomes were put into the background of autosomes
from lines at different levels of bristle number. There is a strong epistatic interaction
between the X chromosomes and the autosomes. Can you find a scale transformation
that will remove the interaction and make the X chromosome effect the same
in all backgrounds? This is not straightforward. It is helpful to know that four of 
the bristles, two on each side, are larger than the others, different in structure, and 
never absent except when the mean is very low. 


                    Source of X chromosome
Autosomal
background        High       Low        Difference (H - L)

A                  9.49       7.75       1.74
B                 13.34      10.84       2.50
C                 22.72      16.36       6.36
D                 34.87      24.84      10.03
E                 47.80      32.80      15.00

Data from McPhee, C.P. & Robertson, A. (1970) Genet.


[Solution 30] 



18 THRESHOLD CHARACTERS


There are many characters of biological interest or economic importance which vary
in a discontinuous manner but are not inherited in a simple Mendelian manner.
Familiar examples are susceptibility to disease, where there are two phenotypic classes
(affected or not-affected), and litter size of the larger mammals that usually bear
one young at a time but sometimes two or three. There are also discontinuous
anatomical differences, such as the number of vertebrae of mice, whose genetics
has been extensively studied. Characters of this sort appear at first sight to be
outside the realm of quantitative genetics; yet when they are subjected to genetic analysis
they are found to be inherited in the same way as continuously varying characters.


Liability and threshold 

The clue to understanding the inheritance of such characters lies in the idea that
the character has an underlying continuity with a threshold which imposes a
discontinuity on the visible expression, as depicted in Fig. 18.1. When the underlying
variable is below this threshold level the individual has one form of phenotypic
expression, e.g., is ‘normal’; when it is above the threshold the individual has the
other phenotypic expression, e.g., is ‘affected’. The underlying continuous variable
has been called the liability in the context of human diseases as threshold characters,
and this term will be used here. The continuous variation of liability is both genetic
and environmental in origin, and may be thought of as the concentration of some
substance, or the rate of some developmental process; of something, that is to
say, that could in principle be measured and studied as a metric character in the
ordinary way. It may be a compound of several different physiological or
developmental processes but it is not necessary to know how these are combined to give the
liability, or even to know what they really are.

That the idea of an underlying variable is a realistic one can be appreciated by 
thinking of litter size. The litter size of mice or pigs, though in reality obviously 
discontinuous, can be treated as a continuous variable because there are a large enough 
number of classes. The litter size of cows has only two classes, single and twin births, 
more than two calves being exceedingly rare. But there is no reason to think that 
the physiological causes of twinning in cattle are different from those of litter size 
in mice or pigs. The underlying variable in both cases is made up mainly of the levels 
of circulating gonadotrophic hormones, which determine the number of eggs shed, 
the intra-uterine factors that affect embryonic survival and, in the case of cattle, 
the factors determining monozygotic twinning.

[Figure 18.1. Horizontal axis: scale of liability (standard deviations from threshold).]

Fig. 18.1. Two populations or groups with different incidences, p, of a threshold character and
consequently different mean liabilities. The variance of liability is the same in the two groups
and the means differ by 0.8 standard deviations of liability. x is the normal deviate of the
threshold from the mean; i is the mean deviation of affected individuals from their group mean.

Two classes, one threshold 

Let us first consider characters which have only two phenotypic classes with a single
threshold separating them. The two classes will be referred to as normal and affected.
On the phenotypic level, or visible scale, individuals can have only two possible
values which might be designated 0 for normal and 1 for affected. Groups of
individuals, however, such as families or the population as a whole, can have any
value, in the form of the proportion or percentage of individuals that are affected.
This is referred to as the incidence or, in the context of human diseases, the
prevalence. The incidence is quite adequate as a simple description of the population
or group, but the percentage scale in which the incidence is expressed is
inappropriate for many purposes, because on a percentage scale variances differ
according to the mean. For genetic analyses, therefore, incidences must be converted
to mean liabilities. In order to make this transformation it is necessary to define
the liability as being normally distributed. This definition carries with it the
requirement that if we could measure the liability directly, it would be possible to render
its distribution normal by some scale transformation. We shall return to this
requirement later; meantime we assume that it can be met. With liability being normally
distributed, then, the unit of liability is its standard deviation $\sigma$. The mean liability
is then related to the incidence by the (single-tailed) normal deviate x, which is the
deviation of the threshold from the mean in standard-deviation units of liability. Values
of x for different incidences are tabulated in Appendix Table A.

Comparison of means. Consider two populations or groups with different incidences,
as shown in Fig. 18.1. By how much do they differ in mean liability? For comparing
different groups, the threshold must be defined as being fixed; that is to say,
it is the same level of liability in all groups. The mean of any group is then expressed
as a deviation in $\sigma$ units from the threshold. In other words, the threshold is taken
as the origin or zero-point on the scale of liability. The upper group in Fig. 18.1
has an incidence $p_1 = 0.05$, which gives $x_1 = 1.6\sigma_1$. The mean liability is thus
$m_1 = -1.6\sigma_1$. (Care must be taken with the signs of mean liabilities. If the threshold
is above the mean, the mean is below the threshold and therefore negative.) Similarly,
the lower group in Fig. 18.1, with an incidence of 0.20, has a mean of
$m_2 = -0.8\sigma_2$. Note that the means are expressed in units of their own population’s
standard deviation. We can go no further with the comparison of the means unless we
assume that the two standard deviations are the same. If this can be accepted as a
reasonable assumption then $\sigma_1 = \sigma_2 = \sigma$, and $m_2 - m_1 = 0.8\sigma$; the means of the
two groups differ by 0.8 standard deviations of liability.
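
The conversion from incidence to mean liability can be sketched in a few lines of Python. This is only an illustration, not part of the original text: it assumes that the standard normal quantile function stands in for Appendix Table A, and the function names are hypothetical.

```python
from statistics import NormalDist

# Liability is defined as normally distributed, measured in its own SD units.
STD_NORMAL = NormalDist()

def threshold_deviate(incidence: float) -> float:
    """Normal deviate x: distance of the threshold above the group mean (SD units)."""
    return STD_NORMAL.inv_cdf(1.0 - incidence)

def mean_liability(incidence: float) -> float:
    """Group mean as a deviation from the threshold (negative when incidence < 0.5)."""
    return -threshold_deviate(incidence)

m1 = mean_liability(0.05)   # upper group in Fig. 18.1
m2 = mean_liability(0.20)   # lower group in Fig. 18.1
difference = m2 - m1        # meaningful only if both groups share the same SD

print(f"m1 = {m1:.2f}, m2 = {m2:.2f}, m2 - m1 = {difference:.2f}")
```

With unrounded deviates the difference comes out very close to the 0.8 standard deviations obtained above from the rounded values.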

Heritability of liability. Suppose that the upper distribution in Fig. 18.1 represents
a parental generation from which affected individuals are selected as parents. When
these parents are bred they produce offspring with the lower distribution. Knowing
the incidence in the parental generation and in the progeny, we have all that is needed
to calculate the regression of offspring on mid-parent values of liability, and from
this the heritability of liability. Consider first the response to the selection. This
is the difference in mean between the parental and progeny generations which was
given above as $0.8\sigma$ assuming the variances of the two generations to be equal. In
fact the variances will not be quite the same, but the small error introduced will
be neglected for the moment. Now consider the selection differential. This is the
mean liability of the affected individuals in the parent generation as a deviation from
their population mean. The proportion of individuals used as parents may be less
than the incidence: in other words, not all affected individuals may be used as parents.
But so long as all parents are affected, the mean of those used is expected to be
the same as the mean of all affected individuals. The mean of the affected individuals
in standard deviation units is equivalent to the intensity of selection, i,
corresponding to the incidence as the proportion selected. Values of i are tabulated in
Appendix Table A. With the incidence of p = 0.05 the intensity of selection is i = 2.1,
and the selection differential is therefore $S = 2.1\sigma$ (see equation [11.5]). The
regression of offspring on mid-parent values is the ratio of response to selection
differential (equation [11.1]) and is $R/S = 0.8\sigma/2.1\sigma = 0.38$. Finally, provided there is
no environmental resemblance between offspring and their parents, the heritability
of liability is equal to the regression of offspring on mid-parent values.
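
The same calculation can be sketched numerically. The snippet below is an illustration under stated assumptions: the intensity of selection is computed as i = z/p, where z is the height of the standard normal curve at the threshold (the usual expression for the mean of a truncated normal distribution), rather than being read from Appendix Table A; the function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def threshold_deviate(incidence: float) -> float:
    """Normal deviate x of the threshold from the group mean (SD units)."""
    return STD_NORMAL.inv_cdf(1.0 - incidence)

def selection_intensity(incidence: float) -> float:
    """Mean liability of affected individuals above their group mean: i = z / p."""
    x = threshold_deviate(incidence)
    return STD_NORMAL.pdf(x) / incidence

p_parents, p_offspring = 0.05, 0.20

# Response R: change of mean liability, assuming equal variances in both generations.
response = threshold_deviate(p_parents) - threshold_deviate(p_offspring)
# Selection differential S: all parents are affected, so S = i (in SD units).
differential = selection_intensity(p_parents)
regression = response / differential

print(f"R = {response:.2f}, S = {differential:.2f}, R/S = {regression:.2f}")
```

Because the deviates are not rounded here, the ratio comes out at about 0.39 rather than the 0.38 obtained from 0.8 and 2.1.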

The calculation of the regression and heritability explained above by reference
to parents and offspring can be applied to any sort of relationship. Suppose that the
upper distribution in Fig. 18.1 represents a population, and the lower distribution
represents any specified sort of relatives, e.g., full sibs, of affected individuals. Then
the regression calculated is that of an individual on his relative. Since the variances
of the two groups (the population and the relatives) are approximately equal,
the regression is approximately equal to the correlation. Thus, with the incidences
in Fig. 18.1 the correlation of full sibs in respect of liability would be 0.38. In the
absence of resemblance due to common environment and of dominance, the
heritability would be estimated as twice the correlation of full sibs, or 0.76.

The calculations explained above can be summarized in the following formulae. 
The correlation of liability between relatives of any specified sort is given by 


$$t = \frac{m_R - m_P}{i} = \frac{x_P - x_R}{i} \qquad [18.1]$$

where the subscripts P and R refer to the population and the relatives respectively,
m is the mean as a deviation from the threshold, x is the normal deviate of the threshold
from the mean, and i is the mean deviation of affected individuals from the population
mean. The signs are appropriate to a scale that assigns a higher liability to affected
than to normal individuals. The heritability is then obtained from the correlation as

$$h^2 = t/r \qquad [18.2]$$

as in equation [10.5], r being the coefficient of relationship as given in Table 9.3.
For first-degree relatives r is 1/2; for second-degree, 1/4; when the relatives are
offspring of parents both of which are affected, r is 1.

The sampling variance of the mean liability, i.e. the normal deviate x, is

$$\sigma_x^2 = \frac{1 - p}{i^2 A}$$

where A is the number affected and p is the incidence. The sampling variance of
the correlation estimated by equation [18.1] is rather complex (see Falconer, 1965a),
but if the population incidence can be assumed to have negligible error it is given by

$$\sigma_t^2 = \frac{1 - p_R}{i_P^2\, i_R^2\, A_R} \quad \text{(approx.)}$$

where subscripts P and R refer to the population and the relatives respectively and
A is the number affected.
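
As a rough numerical sketch of these two sampling variances, the following Python fragment evaluates them as $(1 - p)/(i^2 A)$ and $(1 - p_R)/(i_P^2 i_R^2 A_R)$. It is offered only as an illustration: x and i are computed from the normal distribution instead of Appendix Table A, the function names are hypothetical, and the counts used are those of Example 18.1 (44 affected out of 1,129 in the population; 25 affected out of 215 full sibs).

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def intensity(incidence: float) -> float:
    """i = z / p, the mean liability of affected individuals in SD units."""
    x = STD_NORMAL.inv_cdf(1.0 - incidence)
    return STD_NORMAL.pdf(x) / incidence

def variance_of_deviate(affected: int, total: int) -> float:
    """Sampling variance of x: (1 - p) / (i^2 A)."""
    p = affected / total
    return (1.0 - p) / (intensity(p) ** 2 * affected)

def variance_of_correlation(pop_incidence: float, rel_affected: int, rel_total: int) -> float:
    """Approximate sampling variance of t, treating the population incidence as error-free."""
    p_r = rel_affected / rel_total
    return (1.0 - p_r) / (intensity(pop_incidence) ** 2 * intensity(p_r) ** 2 * rel_affected)

se_x_population = sqrt(variance_of_deviate(44, 1129))
se_t = sqrt(variance_of_correlation(44 / 1129, 25, 215))

print(f"s.e. of x (population) = {se_x_population:.3f}")
print(f"s.e. of t (full sibs)  = {se_t:.3f}")
```

On these figures the standard error of the full-sib correlation is roughly 0.05, which gives an idea of how much weight an estimate from a single herd can bear.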

The error introduced by assuming the variance to be the same in the relatives of 
affected individuals as it is in the population as a whole leads to the correlation 
estimated by equation [18.1] being too low by a factor of 5 or 10 per cent. A modified 
formula that takes account of the unequal variances is the following (Reich, James, 
and Morris, 1972): 


$$t = \frac{x - x_R\sqrt{1 - (x^2 - x_R^2)(1 - x/i)}}{i + x_R^2\,(i - x)} \qquad [18.3]$$

where x and i without subscript refer to the population, and $x_R$ refers to the relatives;
the sign of the square root is taken to make t between 0 and 1.


Example 18.1 Cryptorchidism is a congenital defect of males that occurs in some herds 
of pigs. The data in the table refer to one herd reported by Mikami and Fredeen (1979). 





The incidence in the herd, i.e., the population, was 3.9 per cent and the incidence among 
the full sibs of affected males was 11.6 per cent. The corresponding values of x and 
i taken from Appendix Table A are given in the table. The full-sib correlation is calculated 
by equation [18.1] as follows: 

$$t = \frac{1.762 - 1.195}{2.165} = 0.26$$

Assuming no common environment and no dominance, the heritability of liability, by 
equation [18.2], is 

$$h^2 = 2t = 0.52$$

Calculated by the more accurate formula in equation [18.3], the correlation is t = 0.28,
giving $h^2 = 0.56$.



                       Numbers             Incidence
                   Affected    Total       p%       x        i

Population (P)        44       1,129       3.9     1.762    2.165
Full sibs (R)         25         215      11.6     1.195
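
The arithmetic of Example 18.1 can be checked with a short Python sketch. It takes the tabulated values of x and i as given and applies equation [18.1], the Reich, James and Morris formula [18.3], and equation [18.2] with r = 1/2 for full sibs. The function names are hypothetical and the code is an illustration, not the method used by Mikami and Fredeen.

```python
from math import sqrt

def correlation_equal_variance(x_pop: float, x_rel: float, i_pop: float) -> float:
    """Equation [18.1]: t = (x_P - x_R) / i."""
    return (x_pop - x_rel) / i_pop

def correlation_unequal_variance(x_pop: float, x_rel: float, i_pop: float) -> float:
    """Equation [18.3], allowing for the reduced variance among relatives."""
    root = sqrt(1.0 - (x_pop**2 - x_rel**2) * (1.0 - x_pop / i_pop))
    return (x_pop - x_rel * root) / (i_pop + x_rel**2 * (i_pop - x_pop))

# Values from the table of Example 18.1 (read from Appendix Table A in the text).
x_p, i_p = 1.762, 2.165   # population, incidence 3.9 per cent
x_r = 1.195               # full sibs of affected males, incidence 11.6 per cent
r_full_sibs = 0.5         # coefficient of relationship for first-degree relatives

t_simple = correlation_equal_variance(x_p, x_r, i_p)
t_better = correlation_unequal_variance(x_p, x_r, i_p)

print(f"[18.1] t = {t_simple:.2f}, h2 = {t_simple / r_full_sibs:.2f}")
print(f"[18.3] t = {t_better:.2f}, h2 = {t_better / r_full_sibs:.2f}")
```

The heritabilities are valid only under the assumptions stated in the example: no common environment and no dominance.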



The calculation of the heritability of liability has been widely applied in the study 
of the inheritance of human diseases. The data are collected by questioning patients 
with a particular disease about the disease status of their relatives. The correlation 
in respect of liability can then be calculated as above. The interpretation of the
correlation in terms of the heritability, however, is subject to the uncertainties about
resemblances due to common environment that were emphasized in Chapter 10. 
Estimates obtained for the heritability of liability, assuming no environmental 
resemblance, range from 85 per cent for schizophrenia to 35 per cent for congenital 
heart diseases (Emery, 1976, p. 54). Knowledge of the heritability is useful in genetic 
counselling for calculating recurrence risks in families because it allows all the 
information about the family to be combined correctly. For further details about 
the application to human diseases, see Falconer (1965a, 1967), and Curnow and 
Smith (1975). 

Adequacy of the liability model 

The definition of liability as a normally distributed continuous variable must be 
examined more closely. It implies, as noted earlier, the assumption that the underlying 
variables that combine to give the liability could be made normal by a scale 
transformation. This in turn implies that the distribution of liability is unimodal. 
If in reality it is bimodal or multimodal, no reasonable scale transformation could 
make its distribution normal. The calculations of correlations and heritabilities would 
then be invalid. A bimodal or trimodal distribution could arise in two main ways; 
first, if there was a single gene whose effect on liability was fairly large in relation 
to the residual variation, and second, if there were an environmental factor with 
an effect large in relation to the other variation. Such an environmental factor
affecting liability to a disease might be exposure to a pathogen. Thus the genetic analyses
in terms of liability are valid only if liability is multifactorial, which means that there
are many causes of variation, all with relatively small effects, and the genetic control 
is by genes at more than one or a few loci. 

There is no means of knowing in advance whether these requirements for valid 
analyses are met, because liability cannot be measured to see if its distribution is 
unimodal. One can, however, see whether the results obtained are reasonable and 
consistent. For example, a heritability in excess of 100 per cent would obviously 
be unacceptable, and would suggest a single major gene. Also, in the absence of 
resemblance due to common environment, the heritability estimated from different 
sorts of relatives should be the same. On the whole, the results obtained have been 
reasonably consistent and have given no strong reason to doubt the adequacy of the 
liability model. A method of analysis which tests the consistency of different sorts 
of relatives, and at the same time detects a single major gene, and environmental 
resemblance if present, is known as complex segregation analysis (see Morton and 
MacLean, 1974; Lalouel et al., 1977), but it is too complicated for description here.
A different test of adequacy can be made with characters that have two thresholds, 
which will be described below. 

Scale relationships 

It is possible to assign arbitrary values, 0 and 1, to the two phenotypic classes of 
a threshold character and to calculate the correlation between relatives in respect 
of these values. To do this is like using the phenotypic expression as a very coarsely 
graduated instrument for measuring the liability; an instrument, in fact, with only 
one graduation mark. This introduces a large amount of measurement error which 
appears as environmental variance if components of variance are estimated. The 
amount of variance due to measurement error on the (0, 1) scale depends on the 
incidence; it is least with an incidence of 0.5 and becomes larger with lower or higher 
incidences. In consequence, a correlation calculated on the (0, 1) scale varies with 
the incidence, becoming reduced as the incidence decreases or increases from 0.5. 
Transformation to the liability scale as described above removes this variance due 
to measurement error and renders correlations, and heritabilities derived from them, 
independent of the incidence. There is a simple relationship by which correlations 
or heritabilities can be converted from one scale to the other (Dempster and Lerner, 
1950). If $t_c$ is a correlation on the continuous scale of liability and $t_{01}$ the same
correlation calculated on the (0, 1) scale, the two are related by

$$t_c = t_{01}\,\frac{1 - p}{i^2 p} \qquad [18.4]$$

where p is the incidence and i is the corresponding mean liability of affected individuals. The two
heritabilities are related in the same way. Estimates of heritabilities obtained in the 
above manner have been shown to be generally less accurate than estimates obtained 
by equation [18.3] (Mercer and Hill, 1984). 
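
The dependence of the (0, 1) scale on incidence can be illustrated with a small sketch. It assumes the Dempster and Lerner relation in the form $t_c = t_{01}(1 - p)/(i^2 p)$, with i computed as z/p from the normal distribution; the function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def scale_factor(incidence: float) -> float:
    """(1 - p) / (i^2 p): multiplier taking a (0, 1)-scale correlation to the liability scale."""
    x = STD_NORMAL.inv_cdf(1.0 - incidence)
    i = STD_NORMAL.pdf(x) / incidence
    return (1.0 - incidence) / (i * i * incidence)

def to_liability_scale(t_01: float, incidence: float) -> float:
    return t_01 * scale_factor(incidence)

def to_binary_scale(t_c: float, incidence: float) -> float:
    return t_c / scale_factor(incidence)

# The same liability correlation appears smaller on the (0, 1) scale as the
# incidence moves away from 0.5.
for p in (0.5, 0.2, 0.05, 0.01):
    print(f"p = {p:<5} t_c = 0.30 corresponds to t_01 = {to_binary_scale(0.30, p):.3f}")
```

The multiplier is smallest at an incidence of 0.5, which is the numerical counterpart of the statement that measurement error on the (0, 1) scale is least at that incidence.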

Three classes, two thresholds 

Genetic analysis of threshold characters can be taken further if there are three 
phenotypic classes, provided the classes can be logically ordered with respect to
liability. That is to say, provided there are biological reasons for believing that one
class is intermediate in liability between the other two. An example would be single, 
twin and triplet births. Then the two thresholds separating the three classes are at 
different levels of liability. The two thresholds mark fixed points on the liability 
scale, which are the same in all groups. The difference in liability between the two 
thresholds therefore provides a fixed unit of liability which is independent of the 
standard deviation. This makes it possible to compare the standard deviations or 
variances of different groups, and to compare the means of groups that are expected 
to have different variances. The idea is most easily explained by a numerical example. 

Consider the two populations illustrated in Fig. 18.2. They have different means
and different variances. The thresholds $T_1$ and $T_2$ are fixed points on the liability
scale and the difference between them is 1 threshold unit (t.u.) of liability. The scale
of liability is shown at the bottom with the zero at the position of $T_1$. The first step
is to express each standard deviation in terms of threshold units. This is done from
the incidences as follows. Let p' be the proportion of individuals above $T_1$, i.e.,
the incidence of the intermediate and extreme classes together, and let p" be the
proportion of individuals above $T_2$, i.e., the incidence of the extreme class. Let x'
and x" be the deviations of $T_1$ and $T_2$ from the population mean in standard
deviations. Then the difference between the thresholds, for any population, is $(x'' - x')\sigma$,
where $\sigma$ is the standard deviation of the population. The difference between the
thresholds is by definition 1 t.u., so $(x'' - x')\sigma = 1$ t.u., and the standard deviation
in threshold units is given by

$$\sigma = \frac{1}{x'' - x'} \text{ t.u.} \qquad [18.5]$$

[Figure 18.2. Horizontal axis: scale of liability in threshold units.]

Fig. 18.2. A threshold character with three phenotypic classes and two thresholds. Distributions
of liability in two populations with different means and different variances.

The calculation of the standard deviations of the populations in Fig. 18.2 is as follows, 
where the values of x are found from Appendix Table A: 


Population    p' (%)    p" (%)    x'       x"       1 t.u.               $\sigma$

(1)           40        5         0.25     1.64     $1.39\sigma_1$       0.72 t.u.
(2)           36        10        0.36     1.28     $0.92\sigma_2$       1.09 t.u.


Thus the standard deviation of population (2) is found to be 1.5 times that of population
(1). The means as deviations from $T_1$ in threshold units are found as follows:

$$m_1 = -x'_1\sigma_1 = -0.25 \times 0.72 \text{ t.u.} = -0.18 \text{ t.u.}$$
$$m_2 = -x'_2\sigma_2 = -0.36 \times 1.09 \text{ t.u.} = -0.39 \text{ t.u.}$$

Thus the mean liability of population (2) is 0.21 threshold units below that of
population (1).
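
The two-threshold calculation lends itself to a short sketch. The Python below is an illustration only: it computes x' and x" from the normal distribution rather than from Appendix Table A, so the results may differ from the worked figures in the second decimal place, and the function name is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def liability_in_threshold_units(p_above_t1: float, p_above_t2: float) -> tuple[float, float]:
    """Return (mean, sd) of liability in threshold units, with the origin at T1.

    p_above_t1: incidence of the intermediate and extreme classes together (p').
    p_above_t2: incidence of the extreme class alone (p").
    """
    x1 = STD_NORMAL.inv_cdf(1.0 - p_above_t1)   # x'
    x2 = STD_NORMAL.inv_cdf(1.0 - p_above_t2)   # x"
    sd = 1.0 / (x2 - x1)                        # equation [18.5]
    mean = -x1 * sd                             # deviation of the mean from T1
    return mean, sd

m1, s1 = liability_in_threshold_units(0.40, 0.05)   # population (1)
m2, s2 = liability_in_threshold_units(0.36, 0.10)   # population (2)

print(f"(1) mean = {m1:.2f} t.u., sd = {s1:.2f} t.u.")
print(f"(2) mean = {m2:.2f} t.u., sd = {s2:.2f} t.u.")
print(f"sd ratio = {s2 / s1:.1f}, difference of means = {m1 - m2:.2f} t.u.")
```

Unlike the single-threshold case, no assumption of equal variances is needed here, because the distance between the thresholds supplies a common unit.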

When the variance can be estimated in the manner described above, it becomes
possible to study crosses between lines with different incidences and to compare
the variances of $F_1$ and $F_2$ generations. This provides another test of the adequacy
of the liability model, and in particular of the interpretation of one class as being
intermediate in liability between the other two. The following example illustrates
such a study of a cross of inbred lines of mice. The means and variances of the
parental lines, $F_1$, $F_2$, and the two backcrosses agree very well with what would
be expected of a metric character; in this case there is no reason to doubt the validity
of the threshold model.

Example 18.2 The number of presacral vertebrae in mice varies between 25 and 27,
presacral being defined as being anterior to the first vertebra that is fused to the sacrum.
The character therefore reflects the longitudinal position at which the sacrum is fused
to the vertebral column. Usually only two numbers are present in any one inbred strain,
and we consider here two inbred strains with 25 and 26 but in very different
proportions. A third phenotypic class is provided by the few individuals which are asymmetrical,
having 25 on one side and 26 on the other, through having the last vertebra fused to
the sacrum on one side only. There is clearly some doubt about the asymmetrical mice
being intermediate in liability; they might, instead, be less well regulated in
development. But treating them as intermediate seems to be justified by the genetic analysis of
crosses. The data here refer to one of several crosses described by Green (1962). The
strains were C3H having 13 per cent of individuals with 26 vertebrae, and C57BL
having 90 per cent. The incidences of the different generations are given in the table with
the means and standard deviations calculated as above. The means are deviations in
threshold units from the threshold separating ‘25’ from ‘asymmetrical’. The mid-parent
value, and the expected means of the $F_2$ and backcross generations calculated from
equations [17.3], are also shown in the table. On the whole, the results agree well with
expectations: the $F_1$ is intermediate between $P_1$ and $P_2$; the $F_2$ has its mean near the $F_1$
but has a greatly increased variance; the backcross means are between the $F_1$ and
parental means and have variances between those of the $F_1$ and $F_2$. The only disturbing
anomaly is the very greatly different variances of the two parental lines.



Generation    No. of mice    p'%     p"%     m       E(m)     $\sigma$    $\sigma^2$

$P_1$         282            12.8     8.5    -4.8              4.2        18.0
$P_2$         619            96.4    89.8    +3.4              1.9         3.6
Mid-parent                                           -0.7
$F_1$         532            50.0    31.8     0.0     -        2.1         4.5
$F_2$         206            56.8    46.1    +0.6    -0.3      3.7        13.8
$B_1$         205            33.7    21.5    -1.1    -2.4      2.7         7.4
$B_2$         194            75.3    60.8    +1.7    +1.7      2.4         5.9


The degree of genetic determination, $V_G/V_P$, in the $F_2$ can be estimated from the
difference of variance between the $F_2$ and the $F_1$ as follows:

$F_2$:          $V_G + V_E = 13.8$
$F_1$:          $V_E = 4.5$
$F_2 - F_1$:    $V_G = 9.3$

$V_G/(V_G + V_E) = 0.68$

Thus 68 per cent of the variation of liability in the $F_2$ was genetic. This, again, is a
very reasonable result.
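
The means, variances and degree of genetic determination in Example 18.2 can be recomputed from the incidences alone. The sketch below is illustrative and rests on the same assumptions as the text (normally distributed liability, asymmetrical mice intermediate); deviates come from the normal distribution rather than Appendix Table A, and the dictionary and function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

# Incidences (as proportions) from the table of Example 18.2:
# p' = asymmetrical plus 26 vertebrae, p" = 26 vertebrae.
INCIDENCES = {
    "P1": (0.128, 0.085),
    "P2": (0.964, 0.898),
    "F1": (0.500, 0.318),
    "F2": (0.568, 0.461),
    "B1": (0.337, 0.215),
    "B2": (0.753, 0.608),
}

def mean_and_sd(p1: float, p2: float) -> tuple[float, float]:
    """Mean (from the lower threshold) and sd of liability, in threshold units."""
    x1 = STD_NORMAL.inv_cdf(1.0 - p1)
    x2 = STD_NORMAL.inv_cdf(1.0 - p2)
    sd = 1.0 / (x2 - x1)
    return -x1 * sd, sd

results = {gen: mean_and_sd(*ps) for gen, ps in INCIDENCES.items()}
for gen, (mean, sd) in results.items():
    print(f"{gen}: m = {mean:+.1f}, sd = {sd:.1f}, variance = {sd * sd:.1f}")

# Degree of genetic determination in the F2: (V_F2 - V_F1) / V_F2.
var_f1 = results["F1"][1] ** 2
var_f2 = results["F2"][1] ** 2
genetic_fraction = (var_f2 - var_f1) / var_f2
print(f"V_G / V_P in the F2 = {genetic_fraction:.2f}")
```

Small differences from the tabulated values in the last decimal place are to be expected, since the table was built from rounded deviates.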

From what has been said in this chapter it will be clear that threshold characters 
do not provide ideal material for the study of quantitative genetics, because the genetic 
analyses to which they can be subjected are limited in scope and subject to
assumptions that one would be unwilling to make except under the force of necessity. If
a continuously varying character that is closely correlated with liability can be found, 
it would clearly be better to analyse this as a metric character instead of the threshold 
character. For example, ‘time of survival’ might be used instead of ‘resistant versus 
susceptible’; or an abnormality might be graded in degrees of severity. 

Selection for threshold characters 

The application of selection to a threshold character does not involve the theoretical 
difficulties of genetic analyses. It has some practical importance in connection with 
reducing the incidence of abnormalities and with changing the response of
experimental animals to treatments such as, for example, increasing or decreasing drug
resistance. We shall consider a character with two visible classes and refer first to 
individual selection. 

The response to selection depends in the usual way on the selection differential.
But the selection differential does not depend primarily on the proportion selected,
as with a continuously varying character, but on the incidence, for the following
reason. We may breed exclusively from those individuals in the desired phenotypic
class, but we cannot discriminate between those with high and those with low
liabilities. The selected individuals are therefore a random sample from the desired
class and their mean is the mean of the desired class, irrespective of whether we
select all of the desired class or only a portion of it. Thus selecting a smaller
proportion than the incidence gives no advantage. If, on the other hand, the proportion
that has to be selected is greater than the incidence, we shall be forced to use some
individuals of the undesired class. Their mean liability will be below the population
mean, so the use of undesired individuals as parents will apply some negative
selection. (The mean of the undesired class is easily calculated as $-ip/(1 - p)$, where
i is the mean of the desired class whose incidence is p.) These considerations make
it clear that the maximal selection differential is obtained when the incidence of the
desired class is equal to the proportion that must be selected for replacement. The
greater the difference between the two, the less effective is the selection. For this
reason individual selection against a rare abnormality is very ineffective. As an
example, consider an abnormality that has an incidence of 5 per cent. Then 95 per
cent of individuals are normal and a random sample of them has a mean liability
of $0.1\sigma$ below the population mean, giving a very small selection differential. If
we select in the other direction, for the abnormality, and have to select, for
example, 20 per cent of the population as parents, then of these 20 individuals selected
out of 100, only 5 will be abnormal, the remaining 15 being of necessity normal.
The abnormals will each contribute a differential of $+2.1\sigma$ and the normals each
$-0.1\sigma$, making the net selection differential $0.4\sigma$. In contrast, selection of 20 per
cent for a continuously varying character gives a differential of $1.4\sigma$.
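
The argument can be put in numerical form. The following Python sketch computes the selection differential for individual selection on a two-class threshold character; it is an illustration under the assumptions of the text (normal liability, parents drawn at random within a class), with i obtained as z/p, and the function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def intensity(proportion: float) -> float:
    """Mean (in SD units) of the upper tail of a normal curve holding this proportion."""
    x = STD_NORMAL.inv_cdf(1.0 - proportion)
    return STD_NORMAL.pdf(x) / proportion

def threshold_selection_differential(incidence: float, proportion_selected: float) -> float:
    """Differential, in SD of liability, in the direction of the desired class.

    incidence: proportion of the population in the desired class.
    proportion_selected: proportion of the population needed as parents.
    """
    mean_desired = intensity(incidence)
    if proportion_selected <= incidence:
        # Parents are a random sample of the desired class.
        return mean_desired
    # Otherwise the shortfall is made up from the undesired class.
    mean_undesired = -mean_desired * incidence / (1.0 - incidence)
    shortfall = proportion_selected - incidence
    return (incidence * mean_desired + shortfall * mean_undesired) / proportion_selected

against_abnormality = threshold_selection_differential(0.95, 0.50)  # breed from normals only
for_abnormality = threshold_selection_differential(0.05, 0.20)      # 5 abnormal + 15 normal
continuous = intensity(0.20)                                        # metric character, 20% selected

print(f"against: {against_abnormality:.2f}, for: {for_abnormality:.2f}, continuous: {continuous:.2f}")
```

The proportion of 50 per cent used for selection against the abnormality is arbitrary; any proportion up to 95 per cent gives the same differential, which is the point made in the text.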

Family selection for a threshold character is much more effective than individual 
selection, particularly when the incidence is low. An individual’s phenotype on the 
liability scale is very imprecisely known from its status as affected or normal. The 
mean liability of a family, however, is much more precisely known from the
proportion of affected members. The precision depends of course on the number in
the family, but also on the incidence in the family because the family represents 
a sample from a binomial distribution. An incidence of 50 per cent gives the best 
discrimination between families. The study quoted in Example 18.1 was made in 
order to assess the efficacy of different methods of selection. Individual selection 
was found to require 50 generations to reduce the incidence of cryptorchidism from 
5 to 1 per cent, whereas half-sib family selection was predicted to do the same in 
3 generations. Progeny testing is a form of family selection; for an assessment of 
its efficacy under various circumstances see Curnow (1984). 

The selection differential under individual selection is maximal, as we have seen, 
when the incidence of the desired class is equal to the proportion selected. In some 
circumstances it is possible to control the incidence by external means and to make 
it more nearly optimal. If, for example, the character being selected is a reaction 
to some treatment, the treatment can be intensified or reduced so that the incidence 
is altered. This changed incidence is best regarded as a shift in the threshold relative 
to the mean liability of the population. If the treatment can be further changed as 
the selection proceeds so as to keep the incidence as nearly as possible equal to the 
proportion selected, the maximal response to individual selection will be obtained. 
The progress made can be assessed by subjecting the population, or part of it, to 
the original treatment.

Genetic assimilation. A very interesting result of the application of this principle
of changing the threshold by environmental means is the phenomenon known as
‘genetic assimilation’ (Waddington, 1953). If a threshold character appears as a result
of an environmental stimulus, and selection is applied for this character, it may
eventually be made to appear spontaneously, without the necessity of the environmental
stimulus. In this way, what was originally an ‘acquired character’ becomes by
perfectly orthodox principles of selection an ‘inherited character’ (Waddington, 1942).
In such a situation there are two thresholds, one spontaneous and the other induced,
as shown in Fig. 18.3. The spontaneous threshold is at first outside the range of
variation of the population, so that there is no variation of phenotype and no
selection can be applied (Fig. 18.3(a)). The induced threshold, however, is within the
range of liability covered by the population, and it allows individuals toward one
end of the distribution to be picked out by selection. In this way the mean genotypic
value of the population is changed. If this change goes far enough, some individuals
will eventually cross the spontaneous threshold and appear as spontaneous variants
(Fig. 18.3(b)). When the spontaneous incidence becomes high enough, selection may
be continued without the aid of the environmental stimulus, and the spontaneous
incidence may be further increased (Fig. 18.3(c)).

Example 18.3 An experimental demonstration of genetic assimilation in Drosophila 
melanogaster is described by Waddington (1953). The character was the absence of the 
posterior cross-vein of the wing. In the base population no flies with this abnormality 
were present, but treatment of the puparium by heat shock caused about 30 per cent 
of cross-veinless individuals to appear. Selection in both directions was applied to the 
treated flies, and after 14 generations the incidence of the induced character had risen 
to 80 per cent in the upward selected line, and fallen to 8 per cent in the downward 
selected line. At this time cross-veinless flies began to appear in small numbers among 
untreated flies of the upward-selected line, and by generation 16 the spontaneous incidence
was between 1 and 2 per cent. Selection was then continued without treatment, the population
being subdivided into a number of lines. The best four of the lines, selected without
further treatment, reached spontaneous incidences ranging from 67 per cent to 95 per
cent. The distributions in Fig. 18.3 illustrate the progress of the upward selection. Graph
(b) shows a spontaneous incidence of 2 per cent and an induced incidence of 80 per cent
and thus corresponds approximately with generation 16. On the assumption of constant
variance, the change of mean at this stage amounted to 1.36 standard deviations. Graph
(c) shows a spontaneous incidence of 95 per cent and represents the line that finally showed
the greatest progress. Its mean liability is 5.15 standard deviations above that of the initial
population.

Fig. 18.3. Diagram illustrating the genetic assimilation described in Example 18.3. The
distributions of liability are marked in standard deviations from the original population mean.
The vertical lines show the positions of the induced and spontaneous thresholds, and the arrows
mark the population means at the following three stages of selection.

(a) before selection: incidence - induced = 30%, spontaneous = 0%

(b) after some selection: incidence - induced = 80%, spontaneous = 2%

(c) after further selection: incidence - induced = 100%, spontaneous = 95%
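The change of mean quoted for graph (b) in Example 18.3 can be checked from the induced incidences alone. The following is a minimal sketch, assuming (as the example does) that liability is normally distributed with constant variance and that the induced threshold stays fixed; the function name `threshold_deviate` is ours, chosen for illustration.

```python
from statistics import NormalDist

def threshold_deviate(incidence):
    """Distance of the threshold above the population mean, in standard
    deviations of liability, for a given incidence (proportion affected)."""
    return NormalDist().inv_cdf(1.0 - incidence)

# Induced incidence: 30% before selection, 80% at about generation 16.
x_before = threshold_deviate(0.30)  # positive: threshold above the mean
x_after = threshold_deviate(0.80)   # negative: threshold below the mean

# With a fixed threshold and constant variance, the mean has moved by the
# difference between the two deviates.
shift = x_before - x_after
print(round(shift, 2))
```

The result is close to the 1.36 standard deviations quoted; any difference in the second decimal place presumably reflects rounding of tabulated normal deviates.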

Problems 

18.1 A family study of congenital dislocation of the hip (human) gave the following
results. There were altogether 397 ‘index patients’ whose relatives were studied.
The first-degree relatives were mostly parents and full sibs, the second-degree mostly
grand-parents, uncles, aunts, nephews and nieces, and the third-degree were all first
cousins. The numbers of relatives examined and the numbers affected with the malformation
were as follows.

               Affected    Total
1st degree        35       1,777
2nd degree        16       4,746
3rd degree         8       4,220


The incidence in the population as a whole was about 1 per 1,000. From these data 
evaluate (approximately) the correlation of relatives of the three sorts with respect 
to liability, and estimate the heritability of liability. Which estimate of the heritability 
would you think likely to be the most reliable? 

Data from Wynne-Davies (1970) pp. 316—38 in Emery, A.E.H. (ed.), Modern 
Trends in Human Genetics — 1. Butterworths, London. 


[Solution 94] 

18.2 The data below come, somewhat simplified, from an analysis of twinning in
Swedish Friesian cattle. Cows (or heifers) which had one twin birth were picked
at random from the records. The proportions of their mothers and of their daughters
that had twins in their fourth, fifth, or sixth calvings were then found. These incidences
and the incidence in the breed as a whole are given below. Of those cows
which had twins at their first calving, the proportion that had twins in a later calving
is given in the last line of the table. Calculate approximately the heritability and
the repeatability of liability to produce twins.

Population          3.5%
Mothers             4.6%
Daughters           4.8%
Repeat calvings    10.0%

Data from Johansson, I. et al. (1974) Hereditas, 78, 201-34.


[Solution 104] 

18.3 An alternative way of analysing twinning is to treat it as ‘litter size’, individuals 
having a value 1 if they have single calves or 2 if they have twins. When analysed 
in this way the correlation of half sibs in the Swedish Friesians was found to be 
0.0058. Is this consistent with the heritability of liability calculated in Problem 18.2? 

[Solution 114] 

18.4 Fleece-rot is a damaging condition in Australian Merino sheep, associated 
with wet weather. There is, however, genetic variation in susceptibility. The heritability
of liability was estimated as 0.3 in a flock where the incidence was 23 per
cent. Reducing the incidence is obviously desirable; increasing the incidence might
be useful in an experimental flock. What would be the incidences after two generations
of two-way selection, for reduced and for increased incidence, if 10 per cent
of males and 50 per cent of females were selected on an individual basis? 

Data kindly supplied by Kevin D. Atkins. 


[Solution 124] 

18.5 A form of polydactyly (extra fingers and toes) appeared in a strain of mice 
under selection for large size. At first it was at low frequency and only the hind 
feet were affected. Breeding from the affected individuals over a few generations 
increased the frequency and resulted in the appearance of individuals with hind and 
fore feet affected. There were thus three phenotypic classes: normal (N), hind feet
only affected (H), and both hind and fore feet affected (F). The frequencies after 
5 generations of selection were 

N H F 

20% 50% 30% 

The progeny of different types of mating provided good evidence that H was 
intermediate in liability between N and F. 

Calculate (1) the difference in liability between the two thresholds, in standard 
deviation units, (2) the mean of the population in threshold units, as a deviation from 
the threshold separating N from H, (3) the mean liability of each of the three 
phenotypes in threshold units as deviations from the population mean. 

Data from Roberts, R.C. & Mendell, N.R. (1975) Genet. Res., 191, 427-44. 

[Solution 134]

This chapter deals with the relationships between two metric characters, in particular
with characters whose values are correlated — either positively or negatively — in 
the individuals of a population. Correlated characters are of interest for three chief 
reasons. Firstly, in connection with the genetic causes of correlation through the 
pleiotropic action of genes: pleiotropy is a common property of major genes, but 
we have as yet had little occasion to consider its effects in quantitative genetics. 
Secondly, in connection with the changes brought about by selection: it is important 
to know how the improvement of one character will cause simultaneous changes 
in other characters. And thirdly, in connection with natural selection: the relationship
between a metric character and fitness is the primary agent that determines the
genetic properties of that character in a natural population. This last point, however, 
will be discussed in the next chapter. 

Genetic and environmental correlations 

In genetic studies it is necessary to distinguish two causes of correlation between 
characters, genetic and environmental. The genetic cause of correlation is chiefly 
pleiotropy, though linkage is a cause of transient correlation, particularly in populations
derived from crosses between divergent strains. Pleiotropy is simply the property
of a gene whereby it affects two or more characters, so that if the gene is
segregating it causes simultaneous variation in the characters it affects. For example,
genes that increase growth rate increase both stature and weight, so that they
tend to cause correlation between these two characters. Genes that increase fatness, 
however, influence weight without affecting stature, and are therefore not a cause 
of correlation. The degree of correlation arising from pleiotropy expresses the extent 
to which two characters are influenced by the same genes. But the correlation resulting 
from pleiotropy is the overall, or net, effect of all the segregating genes that affect 
both characters. Some genes may increase both characters, while others increase 
one and reduce the other; the former tend to cause a positive correlation, the latter 
a negative one. So pleiotropy does not necessarily cause a detectable correlation. 
The environment is a cause of correlation in so far as two characters are influenced 
by the same differences of environmental conditions. Again, the correlation resulting 
from environmental causes is the overall effect of all the environmental factors that 
vary; some may tend to cause a positive correlation, others a negative one.

The association between two characters that can be directly observed is the correlation
of phenotypic values, or the phenotypic correlation. This is determined from
measurements of the two characters in a number of individuals of the population. 
Suppose, however, that we knew not only the phenotypic values of the individuals 
measured, but also their genotypic values and their environmental deviations for 
both characters. We could then compute the correlation between the genotypic values 
of the two characters and the correlation between the environmental deviations, and 
so assess independently the genetic and environmental causes of correlation. And 
if, in addition, we knew the breeding values of the individuals, we could determine 
also the correlation of breeding values. In principle there are also correlations between
dominance deviations, and between the various interaction deviations. To deal
with all these correlations would be unmanageably complex but fortunately is not 
necessary since the practical problems can be quite adequately dealt with in terms 
of two correlations. These are the genetic correlation, which is the correlation of 
breeding values, and the environmental correlation, which is not strictly speaking 
the correlation of environmental deviations, but the correlation of environmental 
deviations together with non-additive genetic deviations. In other words, just as the 
partitioning of the variance of one character into two components, additive genetic 
versus all the rest, was adequate for many purposes, so now the covariance of two 
characters need only be partitioned into these same two components. The ‘genetic’ 
and ‘environmental’ correlations thus correspond to the partitioning of the covariance 
into the additive genetic component versus all the rest. The methods of estimating 
these two correlations will be explained later. The first problem to be considered 
is how the genetic and environmental correlations combine together to give the directly 
observable phenotypic correlation. 

The following symbols will be used throughout this chapter: 


X and Y:                  the two characters under consideration.

$r_P$:                    the phenotypic correlation between the two characters X and Y.

$r_A$:                    the genetic correlation between X and Y (i.e., the correlation of
                          breeding values).

$r_E$:                    the environmental correlation between X and Y (including non-additive
                          genetic effects).

cov:                      the covariance of the two characters X and Y, with subscripts P, A,
                          or E, having the same meaning as for the correlations.

$\sigma^2$ and $\sigma$:  variance and standard deviation, with subscripts P, A, or E, as above,
                          and X or Y according to the character referred to; e.g. $\sigma^2_{PX}$ =
                          phenotypic variance of character X.

$h^2$:                    the heritability, with subscript X or Y, according to the character.

$e^2$:                    $= 1 - h^2$.


(The symbol $r_G$ is often used for the genetic correlation but, since the correlation
referred to is almost always the correlation of breeding values, the symbol $r_A$ will
be used here for the sake of consistency with previous chapters.)

A correlation, whatever its nature, is the ratio of the appropriate covariance to
the product of the two standard deviations. For example, the phenotypic correlation is

$$r_P = \frac{\mathrm{cov}_P}{\sigma_{PX}\,\sigma_{PY}}$$

and the phenotypic covariance can be written as

$$\mathrm{cov}_P = r_P\,\sigma_{PX}\,\sigma_{PY}$$

The phenotypic covariance is the sum of the genetic and environmental covariances,
i.e.,

$$\mathrm{cov}_P = \mathrm{cov}_A + \mathrm{cov}_E$$

Writing these covariances in terms of the correlations and standard deviations as
above gives

$$r_P\,\sigma_{PX}\,\sigma_{PY} = r_A\,\sigma_{AX}\,\sigma_{AY} + r_E\,\sigma_{EX}\,\sigma_{EY}$$

Now note that $\sigma_A = h\sigma_P$ and $\sigma_E = e\sigma_P$. Substituting these gives

$$r_P\,\sigma_{PX}\,\sigma_{PY} = r_A\,h_X\sigma_{PX}\,h_Y\sigma_{PY} + r_E\,e_X\sigma_{PX}\,e_Y\sigma_{PY}$$

Dividing through by $\sigma_{PX}\,\sigma_{PY}$ leads to

$$r_P = h_X h_Y r_A + e_X e_Y r_E \qquad [19.1]$$

This shows how the genetic and environmental causes of correlation combine together 
to give the phenotypic correlation. If both characters have low heritabilities, then 
the phenotypic correlation is determined chiefly by the environmental correlation: 
if they have high heritabilities, then the genetic correlation is the more important. 
The dual nature of the phenotypic correlation makes it clear that the magnitude and 
even the sign of the genetic correlation cannot be determined from the phenotypic 
correlation alone. 
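A short numerical sketch of equation [19.1] may make the last point concrete. The heritabilities and correlations used here are hypothetical, chosen only to show a positive phenotypic correlation arising despite a negative genetic correlation; the function names are ours.

```python
import math

def phenotypic_correlation(h2_x, h2_y, r_a, r_e):
    """Equation [19.1]: r_P = h_X h_Y r_A + e_X e_Y r_E, with e^2 = 1 - h^2."""
    h_x, h_y = math.sqrt(h2_x), math.sqrt(h2_y)
    e_x, e_y = math.sqrt(1.0 - h2_x), math.sqrt(1.0 - h2_y)
    return h_x * h_y * r_a + e_x * e_y * r_e

def environmental_correlation(h2_x, h2_y, r_p, r_a):
    """Equation [19.1] rearranged to give r_E from r_P, r_A and the heritabilities."""
    h_x, h_y = math.sqrt(h2_x), math.sqrt(h2_y)
    e_x, e_y = math.sqrt(1.0 - h2_x), math.sqrt(1.0 - h2_y)
    return (r_p - h_x * h_y * r_a) / (e_x * e_y)

# Hypothetical values: low heritabilities, r_A negative, r_E positive.
r_p = phenotypic_correlation(0.25, 0.25, r_a=-0.6, r_e=0.4)
print(round(r_p, 2))  # positive, although the genetic correlation is negative
```

With low heritabilities the environmental term dominates, so the sign of the phenotypic correlation follows the environmental correlation here, not the genetic one.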

A few examples of genetic and environmental correlations are given in Table 19.1. 


Table 19.1 Some examples of phenotypic, genetic, and environmental correlations.
The estimates quoted refer to particular populations in particular circumstances; they
should not be taken as generally applicable.

                                                      r_P      r_A      r_E

Man (Grundbacher, 1974)
  Serum immunoglobulin levels, IgG: IgM               0.20     0.07     0.31
Cattle (Barker and Robertson, 1966)
  Milk-yield: butterfat % (1st lactation)            -0.26    -0.38    -0.18
  Milk yield in 1st: 2nd lactations                   0.40     0.75     0.26
Pigs (Smith, King, and Gilbert, 1962)
  Weight gain: backfat thickness                      0.00     0.13    -0.18
  Weight gain: efficiency                             0.66     0.69     0.64
Poultry (Emsley, Dickerson, and Kashyap, 1977)
  Body weight: egg weight                             0.33     0.42     0.23
  Body weight: egg production                         0.01    -0.17     0.08
  Egg weight: egg production                         -0.05    -0.31     0.02
Mice (Rutledge, Eisen, and Legates, 1973)
  Body weight: tail length                            0.45     0.29     0.56
Drosophila melanogaster (Sheridan et al., 1968)
  Bristle number, abdominal: sternopleural            0.14     0.41     0.06

In some cases the genetic and environmental correlations are different in magnitude,
or even in sign. In other cases the two correlations are of the same sign and not 
very different magnitude, and this is the more usual situation. A large difference, 
and particularly a difference of sign, shows that genetic and environmental sources 
of variation affect the characters through different physiological mechanisms. 

The genetic correlation expresses the extent to which two measurements reflect 
what is genetically the same character. For example, the lengths of the two wings 
of Drosophila must obviously be measures of the same character, wing length. But 
wing length and thorax length, though both measures of body size, are not quite 
the same character; the genetic correlation between them is about 0.75 (Reeve and 
Robertson, 1953). In this connection the genetic correlation has a bearing on the 
interpretation of the repeatability of multiple measurements. In Chapter 8 it was said, 
without explanation, that the repeatability has a precise genetic interpretation only 
if the different measurements are of the same genetic character. The meaning of 
this requirement can now be seen to be that the genetic correlation between the 
measurements must be 1. If the two characters X and Y in equation [19.1] are the
same, the expression for the phenotypic correlation reduces to $r_P = h^2 + e^2 r_E$, which
is equivalent to the repeatability in equation [8.12], since here $e^2 r_E$ is the proportion
of variance due to the general environment ($V_{Eg}$) and the non-additive genetic
variance. The repeatability nevertheless remains a useful concept even though the
genetic correlation may often be somewhat less than 1. 

There are some pairs of characters for which no phenotypic correlation exists. 
These are characters that cannot both be measured on the same individual. The age 
at sexual maturity in males and in females is an example of two such characters. 
Though no phenotypic correlation can be measured, the two characters may nevertheless
be correlated genetically and environmentally, and both these correlations
can be estimated. 


Estimation of the genetic correlation 

The estimation of genetic correlations rests on the resemblance between relatives
in a manner analogous to the estimation of heritabilities described in Chapter 10.
Therefore only the principle and not the details of the procedure need be described
here. Instead of computing the components of variance of one character from an
analysis of variance, we compute the components of covariance of the two characters
from an analysis of covariance which takes exactly the same form as the analysis
of variance. Instead of starting from the squares of the individual values and partitioning
the sums of squares according to the source of variation, we start from the
product of the values of the two characters in each individual and partition the sums
of products according to the source of variation. This leads to estimates of the observational
components of covariance, whose interpretation in terms of causal components
of covariance is exactly the same as that of the components of variance given
in Table 10.4. Thus, in an analysis of half-sib families the component of covariance
between sires estimates $\frac{1}{4}\mathrm{cov}_A$, i.e., one-quarter of the covariance of breeding
values of the two characters. For the estimation of the correlation, the components
of variance of each character are also needed. Thus the between-sire components
of variance estimate $\frac{1}{4}\sigma^2_{AX}$ and $\frac{1}{4}\sigma^2_{AY}$. Therefore the genetic correlation is obtained
as

$$r_A = \frac{\mathrm{cov}_{XY}}{\sqrt{\mathrm{var}_X\,\mathrm{var}_Y}} \qquad [19.2]$$


where var and cov refer to the components of variance and covariance. 
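The half-sib procedure can be sketched as follows. This is only an illustration, assuming a balanced design (the same number of progeny per sire) and ignoring all refinements; the function names and the small data set are hypothetical.

```python
import math

def sire_component(families, a, b):
    """Between-sire component of (co)variance for traits a and b from a
    balanced one-way analysis: (MCP_between - MCP_within) / n.
    families: list of sire families, each a list of (x, y) records."""
    s = len(families)
    n = len(families[0])  # progeny per sire; balanced design assumed
    fam_means = [(sum(r[a] for r in fam) / n, sum(r[b] for r in fam) / n)
                 for fam in families]
    grand_a = sum(m[0] for m in fam_means) / s
    grand_b = sum(m[1] for m in fam_means) / s
    sp_between = n * sum((ma - grand_a) * (mb - grand_b) for ma, mb in fam_means)
    sp_within = sum((r[a] - ma) * (r[b] - mb)
                    for fam, (ma, mb) in zip(families, fam_means) for r in fam)
    mcp_between = sp_between / (s - 1)
    mcp_within = sp_within / (s * (n - 1))
    return (mcp_between - mcp_within) / n

def genetic_correlation_half_sibs(families):
    """Equation [19.2] applied to the between-sire components."""
    cov_xy = sire_component(families, 0, 1)
    var_x = sire_component(families, 0, 0)
    var_y = sire_component(families, 1, 1)
    return cov_xy / math.sqrt(var_x * var_y)

# Hypothetical records (x, y) for three sires with three progeny each.
data = [[(10, 20), (12, 21), (11, 23)],
        [(14, 25), (15, 27), (13, 26)],
        [(18, 30), (17, 33), (19, 31)]]
print(genetic_correlation_half_sibs(data))
```

The factor of one-quarter cancels between numerator and denominator, which is why the sire components can be used directly. With real data the estimated components may be negative or very small, in which case the ratio is undefined or meaningless; the sketch does not guard against this.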

The offspring-parent relationship can also be used for estimating the genetic correlation.
To estimate the heritability of one character from the resemblance between
offspring and parents, we compute the covariance of offspring and parent for the
one character by taking the product of the parent or mid-parent value and the mean
value of the offspring. To estimate the genetic correlation between two characters
we compute what might be called the ‘cross-covariance’, obtained from the product
of the value of X in parents and the value of Y in offspring. This ‘cross-covariance’
is half the genetic covariance of the two characters, i.e., $\frac{1}{2}\mathrm{cov}_A$. The covariances
of offspring and parents for each of the characters separately are also needed, and
then the genetic correlation is given by

$$r_A = \frac{\mathrm{cov}_{XY}}{\sqrt{\mathrm{cov}_{XX}\,\mathrm{cov}_{YY}}} \qquad [19.3]$$

where $\mathrm{cov}_{XY}$ is the ‘cross-covariance’, and $\mathrm{cov}_{XX}$ and $\mathrm{cov}_{YY}$ are the offspring-parent
covariances of each character separately. The cross-covariance can be
calculated from X in parents and Y in offspring or from Y in parents and X in offspring.
If both are available the arithmetic mean should be used. The genetic correlation
can also be estimated from responses to selection in a manner analogous
to the estimation of realised heritability. This will be explained in the next section.
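As a rough sketch of the offspring-parent method, equation [19.3] can be computed from four parallel lists of values, one entry per family. The function names and the numbers are hypothetical, and the mean of the two cross-covariances is used as recommended above.

```python
import math

def covariance(u, v):
    """Sample covariance of two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)

def genetic_correlation_offspring_parent(par_x, par_y, off_x, off_y):
    """Equation [19.3], using the mean of the two cross-covariances."""
    cross = 0.5 * (covariance(par_x, off_y) + covariance(par_y, off_x))
    cov_xx = covariance(par_x, off_x)  # offspring-parent covariance of X
    cov_yy = covariance(par_y, off_y)  # offspring-parent covariance of Y
    return cross / math.sqrt(cov_xx * cov_yy)

# Hypothetical parent values and offspring means for five families.
par_x = [10.0, 12.0, 9.0, 14.0, 11.0]
par_y = [30.0, 33.0, 29.0, 36.0, 30.0]
off_x = [10.5, 11.4, 9.8, 12.6, 11.0]
off_y = [30.6, 32.0, 29.9, 33.5, 31.1]
print(genetic_correlation_offspring_parent(par_x, par_y, off_x, off_y))
```

Because each covariance estimates one-half of the corresponding additive (co)variance, the halves cancel in the ratio.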

Data that provide estimates of genetic correlations provide also estimates of the 
heritabilities of the correlated characters, and of the phenotypic correlations. The 
environmental correlation can then be found from equation [19.1]. If highly inbred 
lines are available, the environmental correlations can be estimated directly from 
the phenotypic correlation within the lines, or preferably within the F1’s of crosses
between the lines. 

Estimates of genetic correlations are usually subject to rather large sampling errors 
and are therefore seldom very precise. Furthermore, genetic correlations are strongly 
influenced by gene frequencies (Bohren, Hill, and Robertson, 1966), so they may 
differ markedly in different populations. For these reasons the examples quoted in 
Table 19.1 must be regarded as approximate values and not necessarily valid for 
other populations. The sampling variance of a genetic correlation is a complicated 
matter. (For details see Van Vleck and Henderson, 1961; Hammond and Nicholas, 
1972.) An approximate formula for the standard error, derived from Reeve (1955b)
and Robertson (1959b) is

$$\sigma_{(r_A)} = \frac{1 - r_A^2}{\sqrt{2}}\,\sqrt{\frac{\sigma_{(h_X^2)}\,\sigma_{(h_Y^2)}}{h_X^2\,h_Y^2}} \qquad [19.4]$$

where $\sigma$ denotes standard error. This refers to an estimate based on both cross-covariances;
if only one is used the $\sqrt{2}$ is to be omitted. Since the standard errors
of the two heritabilities appear in the numerator, an experiment designed to minimize
the sampling variance of an estimate of heritability, in the manner described in Chapter
10, will also have the optimal design for the estimation of a genetic correlation.

Correlated response to selection

The next problem for consideration concerns the response to selection: if we select 
for character X, what will be the change of the correlated character Y? The expected 
response of a character Y, when selection is applied to another character X, may 
be deduced in the following way. The response of character X — i.e. the character 
directly selected — is equivalent to the mean breeding value of the selected individuals. 
This was explained in Chapter 11. The consequent change of character Y is therefore 
given by the regression of the breeding value of Y on the breeding value of X. This 
regression is 


$$b_{(A)YX} = \frac{\mathrm{cov}_A}{\sigma^2_{AX}} = r_A\,\frac{\sigma_{AY}}{\sigma_{AX}}$$

The response of character X, directly selected, by equation [11.4], is

$$R_X = i h_X \sigma_{AX}$$

Therefore the correlated response of character Y is

$$CR_Y = b_{(A)YX}\,R_X \qquad [19.5a]$$

$$= i h_X \sigma_{AX}\, r_A\,\frac{\sigma_{AY}}{\sigma_{AX}}$$

$$= i h_X r_A \sigma_{AY} \qquad [19.5b]$$

Or, by putting $\sigma_{AY} = h_Y \sigma_{PY}$, the correlated response becomes

$$CR_Y = i h_X h_Y r_A \sigma_{PY} \qquad [19.6]$$

Thus the response of a correlated character can be predicted if the genetic correlation
and the heritabilities of the two characters are known. And, conversely,
if the correlated response is measured by experiment, and the two heritabilities are
known, the genetic correlation can be estimated. If the heritability of character Y
is to be estimated as the realized heritability from the response to selection, then
it is necessary to do a double selection experiment. Character X is selected in one
line and character Y in another. Then both the direct and the correlated responses
of each character can be measured. This type of experiment provides two estimates
of the genetic correlation (by equation [19.6]), one from the correlated response
of each character; and the two estimates should agree if the theory of correlated
responses expressed in equation [19.6] adequately describes the observed responses.
A joint estimate of the genetic correlation can be obtained from such double selection
experiments, without the need for estimates of the heritabilities, from the following
formula which may be easily derived from equations [11.4] and [19.5b]:

$$r_A^2 = \frac{CR_X}{R_X}\,\frac{CR_Y}{R_Y} \qquad [19.7]$$
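The prediction in equation [19.6] and the joint estimate in equation [19.7] can be sketched together. The parameter values below are hypothetical; the function names are ours, and taking the sign of the estimate from the sign of the correlated responses is our own convention for the sketch, since [19.7] gives only the square of the correlation.

```python
import math

def direct_response(i, h2, sd_p):
    """Equation [11.4] as quoted in the text: R = i h sigma_A = i h^2 sigma_P."""
    return i * h2 * sd_p

def correlated_response(i, h2_selected, h2_correlated, r_a, sd_p_correlated):
    """Equation [19.6]: CR_Y = i h_X h_Y r_A sigma_PY, where X is selected."""
    return i * math.sqrt(h2_selected * h2_correlated) * r_a * sd_p_correlated

def joint_genetic_correlation(cr_x, r_x, cr_y, r_y):
    """Equation [19.7]: r_A^2 = (CR_X / R_X)(CR_Y / R_Y).
    The sign is taken from CR_X / R_X (an assumption of this sketch)."""
    magnitude = math.sqrt((cr_x / r_x) * (cr_y / r_y))
    return -magnitude if cr_x / r_x < 0 else magnitude

# Hypothetical parameters: equal intensity of selection in both lines.
i, h2_x, h2_y, r_a, sd_px, sd_py = 1.0, 0.25, 0.64, 0.6, 2.0, 5.0

r_x = direct_response(i, h2_x, sd_px)                     # X line, X measured
cr_y = correlated_response(i, h2_x, h2_y, r_a, sd_py)     # X line, Y measured
r_y = direct_response(i, h2_y, sd_py)                     # Y line, Y measured
cr_x = correlated_response(i, h2_y, h2_x, r_a, sd_px)     # Y line, X measured

print(joint_genetic_correlation(cr_x, r_x, cr_y, r_y))
```

When the responses are generated from the theory itself, as here, the joint estimate simply recovers the genetic correlation that was put in; with real responses it will not do so exactly.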


Example 19.1 In a study of wing length and thorax length in Drosophila melanogaster,
Reeve and Robertson (1953) estimated the genetic correlation between these two measures
of body size from the responses to selection. There were two pairs of selection lines;
one pair was selected for increased and for decreased thorax length, and the other pair
for increased and for decreased wing length. In each line the correlated response of the
character not directly selected was measured, as well as the response of the character 
directly selected. Two estimates of the genetic correlation were obtained by equation 
[19.7], one from the responses to upward selection and the other from the responses 
to downward selection. In addition, estimates of the genetic correlation in the unselected 
population were obtained from the offspring-parent covariance and also from the full- 
sib covariance. The four estimates were as follows: 


Method Genetic correlation 

Offspring—parent 0.74 

Full sib 0.75 

Selection, upward 0.71 

Selection, downward 0.73 

The agreement between the estimates from selection and the estimates from the unselected 
population shows that the correlated responses were very close to what would have been 
predicted from the genetic analysis of the unselected population. 

Close agreement between observed and predicted correlated responses, such as 
was shown in the above example, cannot always be expected and, indeed, is not 
often found, particularly if the genetic correlation is low. Furthermore, double selection
experiments are often inconsistent in the estimates of the genetic correlation
that they give. There are two reasons for the low predictability and the inconsistency
of correlated responses. The first is the low precision of estimates of the genetic
correlation in the base population, resulting from the large sampling errors already
mentioned. The second reason is the sensitivity of genetic correlations to gene frequency
changes (Bohren, Hill, and Robertson, 1966); the genetic correlation, and
therefore the correlated response, can change rapidly during the course of the selection
as a result of the selection itself and of random drift. For these reasons there
must be some lack of confidence in applying the theory of correlated responses in 
practice. We shall, however, pursue the practical implications of the theory a little 
further in the next section, but with the caution that the theory cannot always be 
relied on to work well in practice. 

Correlated selection differential. When selection is applied to one character X, any 
phenotypically correlated character Y will have a correlated , or apparent, selection 
differential on it. In other words, the individuals selected for X will have a mean 
value of Y that is different from the population mean. At first sight it might seem 
that some use could be made of this correlated selection differential for predicting 
the correlated response, or for estimating the heritability of the correlated character, 
in a manner analogous to equation [11.7] ( h 2 = R/S). Unfortunately, however, the 
correlated selection differential is of no use for either of these purposes. The reason 
is briefly as follows. Let $S_Y$ be the correlated selection differential on Y. Then
$S_Y = b_{(P)YX} S_X$. Writing the correlated response in the form of equation [19.5a] gives

$$\frac{CR_Y}{S_Y} = \frac{b_{(A)}\,R_X}{b_{(P)}\,S_X} \qquad [19.8a]$$

Substituting $b = r\sigma_Y/\sigma_X$ and $R_X/S_X = h_X^2$ leads to

$$\frac{CR_Y}{S_Y} = \frac{r_A}{r_P}\,h_X h_Y \qquad [19.8b]$$

Thus, without knowing the genetic correlation $r_A$, it is not possible to use the correlated
selection differential $S_Y$, either to estimate the heritability of character Y
or to predict the correlated response. Equation [19.8] can also be written in the form

$$\frac{CR_Y}{S_Y} = \frac{\mathrm{cov}_{(A)}}{\mathrm{cov}_{(P)}} \qquad [19.8c]$$

which is analogous to the direct response, $R/S = V_A/V_P$.
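A small numerical check, under hypothetical parameter values, shows the step from [19.8a] to [19.8b] and why the ratio $CR_Y/S_Y$ does not estimate the heritability of Y. The function names are ours.

```python
import math

def ratio_by_19_8b(r_a, r_p, h2_x, h2_y):
    """CR_Y / S_Y by equation [19.8b]."""
    return (r_a / r_p) * math.sqrt(h2_x * h2_y)

def ratio_by_19_8a(r_a, r_p, h2_x, h2_y, sd_px, sd_py, s_x):
    """The same ratio built step by step from the two regressions in [19.8a]."""
    sd_ax = math.sqrt(h2_x) * sd_px   # sigma_A = h sigma_P
    sd_ay = math.sqrt(h2_y) * sd_py
    b_a = r_a * sd_ay / sd_ax         # regression of breeding values, Y on X
    b_p = r_p * sd_py / sd_px         # regression of phenotypic values, Y on X
    r_x = h2_x * s_x                  # direct response, R = h^2 S
    cr_y = b_a * r_x                  # correlated response
    s_y = b_p * s_x                   # correlated selection differential
    return cr_y / s_y

# Hypothetical values; h2_y = 0.64 is the quantity one might hope to recover.
args = dict(r_a=0.6, r_p=0.3, h2_x=0.25, h2_y=0.64)
print(ratio_by_19_8b(**args))
print(ratio_by_19_8a(sd_px=2.0, sd_py=5.0, s_x=1.0, **args))
```

Both routes give the same ratio, and it differs from the heritability of Y (0.64 in this made-up case) unless $r_A h_X / r_P$ happens to equal $h_Y$.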


Indirect selection 

Consideration of correlated responses suggests that it might sometimes be possible 
to achieve more rapid progress under selection for a correlated response than from 
selection for the desired character itself. In other words, if we want to improve 
character X, we might select for another character Y, and achieve progress through 
the correlated response of character X. We shall refer to this as indirect selection ; 
that is to say, selection applied to some character other than the one it is desired 
to improve. And we shall refer to the character to which selection is applied as the 
secondary character. The conditions under which indirect selection would be 
advantageous are readily deduced. Let R x be the direct response of the desired 
character, if selection were applied directly to it; and let CR X be the correlated 
response of character X resulting from selection applied to the secondary character 
Y. The merit of indirect selection relative to that of direct selection may then be 
expressed as the ratio of the expected responses, $CR_X/R_X$. Taking the expected correlated
response from equation [19.5b] and the expected direct response from equation
[11.4], we find

$$\frac{CR_X}{R_X} = \frac{i_Y h_Y r_A \sigma_{AX}}{i_X h_X \sigma_{AX}} = \frac{i_Y\,r_A\,h_Y}{i_X\,h_X} \qquad [19.9]$$


It can be seen from this expression that, with equal intensities of selection, indirect
selection will be better than direct selection if $r_A h_Y$ is greater than $h_X$. These two
quantities are the accuracies of the two selection procedures; $r_A h_Y$ is the correlation
between breeding values of the desired character X and phenotypic values of the selected
character Y, while $h_X$ is the accuracy of individual selection (i.e. direct selection) as
explained in Chapter 13. Thus indirect selection cannot be expected to be better than direct
selection unless the secondary character has a substantially higher heritability and the
genetic correlation is high. There are, however, practical considerations that may make indirect
selection preferable. Three such practical matters may be mentioned.

1. If the desired character is difficult to measure with precision, the errors of 
measurement may so reduce the heritability that indirect selection becomes 
advantageous. 

2. If the desired character is measurable in one sex only, but the secondary character
is measurable in both, then a higher intensity of selection will be possible by indirect
selection. Other things being equal, the intensity of selection would be twice as great
by indirect as by direct selection; but a better plan would be to select one sex directly
for the desired character and the other indirectly for the secondary character.

3. The desired character may be costly to measure, as for example the efficiency 
of food-conversion. Then it may be economically better to select for an easily 
measured correlated character, such as growth rate. 
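The condition for indirect selection to be advantageous can be explored with a one-line function for equation [19.9]. The values below are hypothetical and serve only to illustrate the first two practical points; the function and parameter names are ours.

```python
import math

def relative_efficiency(r_a, h2_secondary, h2_desired, intensity_ratio=1.0):
    """Equation [19.9]: CR_X / R_X = (i_Y / i_X) r_A h_Y / h_X,
    where Y is the secondary character and X the desired character."""
    return intensity_ratio * r_a * math.sqrt(h2_secondary / h2_desired)

# Equal heritabilities and intensities: indirect selection is inferior.
print(relative_efficiency(r_a=0.8, h2_secondary=0.3, h2_desired=0.3))

# Desired character poorly measured, so its heritability is low (point 1).
print(relative_efficiency(r_a=0.8, h2_secondary=0.5, h2_desired=0.2))

# Desired character measurable in one sex only, so i_Y / i_X = 2 (point 2).
print(relative_efficiency(r_a=0.8, h2_secondary=0.3, h2_desired=0.3,
                          intensity_ratio=2.0))
```

A value above 1 means that the expected correlated response exceeds the expected direct response.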

For a detailed evaluation of indirect selection, see Searle (1965). The following 
is an example of indirect selection giving a better response than direct selection for 
a character measurable in only one sex. 

Example 19.2 (Data from Nagai et al., 1978). Mice were selected for two characters,
nursing ability of females measured as the 12-day weight of the litter, and 6-week weight
of individuals. Nursing ability will be designated as N and 6-week weight as W. The
selection was done in two populations, P and Q, with different origins, and was continued
for 12 generations. In each population one line was selected upwards for N, giving
the direct response of N and the correlated response of W; a second line was selected
for W giving the direct response of W and the correlated response of N. The responses
are given in the table, correlated responses being marked with an asterisk; all are in units of
grams per generation.

Population                          P                     Q
Character selected              N         W           N         W

Response of N                 0.080     0.134*      0.054     0.125*
Response of W                 0.197*    0.680       0.198*    0.868
Observed CR_N/R_N                  1.675                 2.315
Realized h^2                  0.16      0.40        0.11      0.43
Realized r_A                       0.70                  0.73
h_W r_A / h_N                      1.11                  1.44
Expected† CR_N/R_N                 2.2                   2.9

* Correlated responses.

† Assuming i_W/i_N = 2.

The genetic correlation in the P population is calculated by equation [19.7] as follows:

$$r_A^2 = \frac{CR_N}{R_N}\,\frac{CR_W}{R_W} = \frac{0.134}{0.080} \times \frac{0.197}{0.680} = 0.485; \qquad r_A = 0.70$$

Similarly in the Q population, $r_A = 0.73$. There was thus very good agreement between
the two populations in the estimation of $r_A$.

As can be seen from the ratio CR/R in the table, indirect selection was substantially 
better than direct selection for improving nursing ability. The reasons for this are that 
W has a higher heritability than N and can be selected in both sexes. The heritabilities, 
estimated as realized heritabilities from the direct responses, are given in the table. Again 
the two populations show good agreement for both characters. The intensities of selection
actually applied are not given, but the expected intensities can be deduced from
the proportions selected. The same proportions were selected for both characters, but 
females only were selected for nursing ability while both sexes were selected for weight. 
The net intensity of selection was therefore expected to be twice as great for weight as 
for nursing ability. 

The expected ratio of correlated to direct responses of nursing ability can now be
calculated by equation [19.9] from the observed heritabilities and genetic correlation
and the presumed intensities of selection. For population P it is

$$ \frac{CR_N}{R_N} = r_A \frac{i_W h_W}{i_N h_N} = 0.70 \times 2 \times \sqrt{\frac{0.40}{0.16}} = 2.2 $$

The ratio for population Q is 2.9. In both cases the ratio realized was somewhat less 
than the expectation, presumably because the intensities of selection realized were not 
as much as twice as great for weight as for nursing ability. 
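
The expected ratios for both populations can be checked with a short sketch of equation [19.9]. The function and argument names are assumptions made for illustration; the intensity ratio of 2 is the presumption stated in the text, not a measured value.

```python
import math

def expected_cr_to_r_ratio(r_a, h2_selected, h2_target, intensity_ratio):
    """Equation [19.9]: CR_Y / R_Y = r_A * (i_X / i_Y) * (h_X / h_Y).

    h2_selected is the heritability of the character selected (X),
    h2_target that of the character to be improved (Y), and
    intensity_ratio is i_X / i_Y.
    """
    return r_a * intensity_ratio * math.sqrt(h2_selected / h2_target)

# Realized heritabilities and genetic correlations from the table;
# weight (W) is the character selected, nursing ability (N) the target.
ratio_p = expected_cr_to_r_ratio(r_a=0.70, h2_selected=0.40, h2_target=0.16, intensity_ratio=2)
ratio_q = expected_cr_to_r_ratio(r_a=0.73, h2_selected=0.43, h2_target=0.11, intensity_ratio=2)

print(f"Expected CR_N/R_N: P = {ratio_p:.1f}, Q = {ratio_q:.1f}")
```

Setting intensity_ratio to a value below 2 shows how far the realized intensities would have to fall short of the presumed ones to account for the observed ratios.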


Though indirect selection has been presented above as an alternative to direct selection,
the most effective method in theory is neither one nor the other but a combination
of the two. The most effective use that can be made of a correlated character
is in combination with the desired character, as an additional source of information 
about the breeding values of individuals. This, however, is a special case of a more 
general problem which will be dealt with in the final section of this chapter. First 
we shall show how the idea of indirect selection can be extended to cover selection 
in different environments. 


Genotype-environment interaction

The concept of genetic correlation can be applied to the solution of some problems 
connected with the interaction of genotype with environment. The meaning of interaction
between genotype and environment was explained in Chapter 8, where it was
discussed as a source of variation of phenotypic values, which in most analyses is 
inseparable from the environmental variance. The chief problem which it raises, 
and which we are now in a position to discuss, concerns adaptation to local conditions.
The existence of genotype-environment interaction may mean that the best
genotype in one environment is not the best in another environment. It is obvious, 
for example, that the breed of cattle with the highest milk-yield in temperate climates 
is unlikely also to have the highest yield in tropical climates. But it is not so obvious 
whether smaller differences of environmental conditions also require locally adapted 
breeds; nor is it intuitively obvious how much of the improvement made in one 
environment will be carried over if the breed is then transferred to another environment.
These matters have an important bearing on breeding policy. If selection is
made under good conditions of feeding and management on the best farms and 
experimental stations, will the improvement achieved be carried over when the later 
generations are transferred to poorer conditions? Or would the selection be better 
done in the poorer conditions under which the majority of animals are required to 
live? The idea of genetic correlation provides the basis for a solution of these problems
in the following way.

A character measured in two different environments is to be regarded not as one 
character but as two. The physiological mechanisms are to some extent different, 
and consequently the genes required for high performance are to some extent also 
different. For example, growth rate on a low plane of nutrition may be principally 
a matter of efficiency of food-utilization, whereas on a high plane of nutrition it 
may be principally a matter of appetite. By regarding performance in different 
environments as different characters with genetic correlation between them, we can 
in principle solve the problems outlined above from a knowledge of heritabilities 
of the different characters and the genetic correlations between them. If the genetic
correlation is high, then performance in two different environments represents very 
nearly the same character, determined by very nearly the same set of genes. If it 
is low, then the characters are to a great extent different, and high performance 
requires a different set of genes. Here we shall consider only two environments, 
but the idea can be extended to an indefinite number of different environments 
(Robertson, 1959b; Dickerson, 1962; Yamada, 1962).

Let us consider the problem of the ‘carry-over’ of the improvement from one
environment to another. Suppose that we select for character X (say growth rate
on a high plane of nutrition) and we look for improvement in character Y (say
growth rate on a low plane of nutrition). The improvement of character Y is simply
a correlated response, and the expected rate of improvement was given in equation
[19.6] as

$$ CR_Y = i h_X h_Y r_A \sigma_{PY} $$

The improvement of performance in an environment different from the one in which 
selection was carried out can therefore be predicted from a knowledge of the heritability
of performance in each environment and the genetic correlation between the
two performances. We can also compare the improvement expected by this means 
with that expected if we had selected directly for character Y, i.e., for performance 
in the environment for which improvement is wanted. This is simply a comparison 
of indirect with direct selection, which was explained in the previous section. The 
comparison is made from the ratio of the two expected responses given in equation 
[19.9], i.e., 

$$ \frac{CR_Y}{R_Y} = r_A \frac{i_X h_X}{i_Y h_Y} $$

This shows how much we may expect to gain or lose by carrying out the selection 
in some environment other than the one in which the improved population is required 
to live. If we assume that the intensity of selection is not affected by the environment 
in which the selection is carried out, then the indirect method will be better if $r_A h_X$
is greater than $h_Y$, where $h_X$ is the square-root of the heritability in the environment
in which selection is made, and $h_Y$ is the square-root of the heritability in the
environment in which the population is required subsequently to live. If the genetic
correlation is high, then the two characters can be regarded as being substantially
the same; and if there are no special circumstances affecting the heritability or the
intensity of selection, it will make little difference in which environment the selection
is carried out. But if the genetic correlation is low, then it will be advantageous
to carry out the selection in the environment in which the population is destined 
to live, unless the heritability or the intensity of selection in the other environment 
is very considerably higher. 

This is the theoretical basis for dealing with selection in different environments. 
There have been several experiments testing the theory. In general they confirm 
the theory in finding correlated responses to be smaller than direct responses, i.e., 
selection is most effective if carried out in the environment for which the improvement
is sought. These experiments, however, are not free of the inconsistencies mentioned
earlier, particularly inconsistencies between the responses to upward and to
downward selection. An experiment with mice is described briefly in the following
example; for other examples, see two experiments with Tribolium in which the 
characters were larval growth (Yamada and Bell, 1969) and rate of egg-laying 
(Orozco, 1976). 

Example 19.3 (Data from Falconer, 1960b). Mice were selected for growth from 3
to 6 weeks on two diets: ‘good’ which was the normal diet, and ‘bad’ which was the
normal diet diluted with 50 per cent indigestible fibre. The bad diet reduced growth by
about 20 per cent at the beginning of the experiment. The direct and correlated responses
were measured in each generation, the direct responses from first-litter progeny grown
on the diet of selection, and the correlated responses from second-litter progeny grown
on the other diet. Selection was carried out in both directions. There were inconsistencies
between selection in opposite directions and between the earlier and later generations.
For the purpose of illustrating the theory, these inconsistencies are avoided by
taking the results over the first four generations only, and expressing the responses as
the divergence between upward and downward selection. The table gives the information
needed to calculate the genetic correlation from each pair of lines separately by equation
[19.9] and from both pairs of lines together by equation [19.7]. The responses are in
grams per generation. As expected, both correlated responses were less than the direct
responses, the ratios between the two indicating a genetic correlation of 0.66.

Divergence (g) per generation to generation 4.

                                                  Character
                                          Growth on      Growth on
                                          good diet      bad diet

Intensity of selection, i                   1.66           1.40
Realized heritability, h^2                  0.41           0.36
Direct response, R                          0.90           1.20
Correlated response, CR                     0.48           0.98
Genetic correlation, by eqn [19.9], r_A     0.67           0.65
Genetic correlation, by eqn [19.7], r_A            0.66
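
The three estimates of the genetic correlation can be recomputed from the tabulated values, as in the following sketch. It assumes that each column of the table describes one character (growth on the good diet or on the bad diet), so that the correlated response in a column is the response of that character in the lines selected on the other diet; the variable and function names are illustrative.

```python
import math

# Values from the table (divergence in g per generation, to generation 4).
good = {"i": 1.66, "h2": 0.41, "R": 0.90, "CR": 0.48}  # growth on good diet
bad = {"i": 1.40, "h2": 0.36, "R": 1.20, "CR": 0.98}   # growth on bad diet

def r_a_from_one_pair(target, selected):
    """Equation [19.9] rearranged: r_A = (CR_Y / R_Y) * (i_Y h_Y) / (i_X h_X).

    'target' (Y) is the character whose correlated and direct responses are
    compared; 'selected' (X) is the character selected in the lines that
    gave the correlated response.
    """
    i_h_target = target["i"] * math.sqrt(target["h2"])
    i_h_selected = selected["i"] * math.sqrt(selected["h2"])
    return (target["CR"] / target["R"]) * i_h_target / i_h_selected

def r_a_from_both_pairs(a, b):
    """Equation [19.7]: r_A^2 = (CR_a / R_a) * (CR_b / R_b)."""
    return math.sqrt((a["CR"] / a["R"]) * (b["CR"] / b["R"]))

r_a_good = r_a_from_one_pair(target=good, selected=bad)
r_a_bad = r_a_from_one_pair(target=bad, selected=good)
r_a_joint = r_a_from_both_pairs(good, bad)

print(f"[19.9]: {r_a_good:.2f} and {r_a_bad:.2f}; [19.7]: {r_a_joint:.2f}")
```

The estimate from equation [19.7] is the geometric mean of the two estimates from equation [19.9], because the intensities and heritabilities cancel when the two ratios are multiplied.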


Environmental sensitivity. The way in which genotype-environment interaction
arises from differences in sensitivity to the environment was explained in Chapter 
8. The genetic correlation provides a means of quantifying the interaction for the 
purpose of predicting responses to selection. Understanding responses to selection 
in different environments may, however, be helped by thinking about environmental
sensitivity. A high genetic correlation means that all genotypes react in nearly
the same way to environmental differences; a plot like that of Fig. 8.2 would have 
regression lines that were all nearly parallel. A low genetic correlation means that 
genotypes react differently and have regression lines with different slopes, i.e., 
individuals have different environmental sensitivities. How does selection act on these 
differences of sensitivity? It is convenient to refer to environments as ‘good’ or ‘bad’ 
according to whether they increase or decrease the character; in practice an increase 
is generally sought, so an environment that increases the character is ‘good’. The 
effect of selection on sensitivity can be seen from Fig. 8.2. Upward selection in 
a good environment tends to pick individuals with high sensitivity, and downward 
selection in a bad environment does the same. In contrast, upward selection in a 
bad environment and downward selection in a good environment tend to pick 
individuals with low sensitivity. In other words, high sensitivity will be selected
for when the selection and the environment act on the character in the same direction,
and low sensitivity will be selected for when selection and environment act
in opposite directions. These expected changes of environmental sensitivity have 
been confirmed by several experiments: with the fungus Schizophyllum commune 
by Jinks and Connolly (1973, 1975); with the tobacco Nicotiana rustica by Jinks 
and Pooni (1982); with mice in the experiment of Example 19.3 as described in 
Example 19.4 below; and by a similar experiment with mice by Nielsen and Andersen 
(1987). 

What is wanted in practice is often not performance in a specific environment 
but performance in a range of environments, both good and bad, i.e., good average 
performance in different environments. Individuals cannot usually be measured in 
more than one environment, so selection for average performance has to be family 
selection with the families divided between the environments. Details of how the 
two phenotypes of a family should be combined in an index to give the maximum 
improvement of average performance are given by James (1961). If measurements 
can only be made in one environment, consideration of environmental sensitivity 
suggests that the best average performance will be achieved by selecting in an 
environment that acts in the direction opposite to the selection, i.e. by upward selection
in a bad environment or, if reduction of the character is desired, by downward
selection in an environment that increases the character (Jinks and Connolly, 1973). 
This expectation was borne out by the mouse experiment in Examples 19.3 and 19.4 
and, partially at least, by the Tribolium experiment of Yamada and Bell (1969). 

Example 19.4 The mouse experiment of Example 19.3 provided data on environmental
sensitivity and overall performance. In generation 7 an unselected control was measured
and the responses in the two directions are given in the table as deviations from the control.
The correlated responses are in parentheses. The environmental sensitivity of each line
is the difference between its growth on the two diets, shown under ‘Effect of diet’. As
expected, the most sensitive lines are those selected upwards on the good diet and
downwards on the bad diet. Also as expected, the lines showing the best overall or average
performance, shown under ‘Mean of both diets’, are the one selected upwards on the
bad diet (where high growth was the objective) and the one selected downwards on the
good diet (where low growth was the objective).


Total response (g) to generation 7 as deviations from controls.

    Selection                        Response                    Sensitivity

Direction   Diet      Growth on    Growth on    Mean of          Effect
                      good diet    bad diet     both diets       of diet

Up          good         2.3         (0.6)         1.45            5.4
Up          bad         (1.6)         3.1          2.35            3.5
Down        good        -2.8        (-2.9)        -2.85            3.6
Down        bad        (-1.2)        -3.2         -2.20            6.8
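
The column ‘Mean of both diets’ and the choice of the best lines can be reproduced with the sketch below. The dictionary layout is an assumption made for illustration, and the entry for the line selected downwards on the bad diet and grown on the good diet is taken as -1.2, the value consistent with the tabulated mean of -2.20.

```python
# Responses (g) to generation 7 as deviations from the control, keyed by
# (direction of selection, diet of selection); each value is
# (growth on good diet, growth on bad diet).
responses = {
    ("up", "good"): (2.3, 0.6),
    ("up", "bad"): (1.6, 3.1),
    ("down", "good"): (-2.8, -2.9),
    ("down", "bad"): (-1.2, -3.2),
}

# Average performance over the two diets for each line.
mean_of_both_diets = {line: (g + b) / 2 for line, (g, b) in responses.items()}

# Best average performance: the largest mean among the upward lines and
# the smallest mean among the downward lines.
best_up = max((k for k in mean_of_both_diets if k[0] == "up"),
              key=mean_of_both_diets.get)
best_down = min((k for k in mean_of_both_diets if k[0] == "down"),
                key=mean_of_both_diets.get)

print("Best upward line was selected on the", best_up[1], "diet")
print("Best downward line was selected on the", best_down[1], "diet")
```

In both directions the best line is the one selected in the environment that acts against the direction of selection, as the argument from environmental sensitivity suggests.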


Index selection 

When selection is applied to the improvement of the economic value of animals or
plants, it is generally applied to several characters simultaneously and not just to
one, because economic value depends on more than one character. This is usually
referred to as multiple trait selection. For example, the profit made from a herd
of pigs depends on their fertility, mothering ability, growth rate, efficiency of
food-utilization, and carcass qualities. How, then, should selection be applied to the
component characters in order to achieve the maximum improvement of economic value?
There are several possible procedures. One might select in turn for each character
singly in successive generations (tandem selection); or one might select for all the
characters at the same time but independently, rejecting all individuals that fail to
come up to a certain standard for each character regardless of their values for any
other of the characters (independent culling levels). The method that is expected
to give the most rapid improvement of economic value, however, is to apply the
selection simultaneously to all the component characters together, appropriate weight
being given to each character according to its relative economic importance, its
heritability, and the genetic and phenotypic correlations between the different
characters. The practice of selection for economic value is thus a matter of some
complexity. The component characters have to be combined together into a score,
or index, in such a way that selection applied to the index, as if the index were a
single character, will yield the most rapid possible improvement of economic value.

The principles of index selection were introduced in Chapter 13 and will not be 
repeated in full here. The main difference in the index required here is that the 
breeding value to be predicted is not that of a single character but that of a composite
of several characters evaluated in economic terms. The index is consequently
more complex than the one developed in Chapter 13. We shall, however, start by 
considering the simpler problem of improving a single character by the use of an 
index, and then extend this to improving economic value. 

Construction of the index 

The objective of the selection, whatever it may be, will be referred to as merit, and
the breeding value for merit will be symbolized by H. The index to be constructed
for the improvement of merit is, as before,

$$ I = b_1 P_1 + b_2 P_2 + \dots + b_m P_m \qquad [19.10] $$

where $P_1$ to $P_m$ are phenotypic measurements of m characters on which selection
is to be based, and $b_1$ to $b_m$ are the corresponding weighting factors to be determined.
The b's are partial regression coefficients of H on the P's. Information from relatives
can be included in the index, so the P's can be measurements of relatives in the
manner explained in Chapter 13.

Single trait. First consider selection aimed at improving just one character. The 
purpose of applying index selection is then to use secondary characters as aids to 
improvement of the one desired character. The index equations, whose solution gives 
the values of the b's in the index, are exactly the same as equations [13.10], with 
character 1 as the character to be improved. 

$$
\begin{aligned}
b_1 P_{11} + b_2 P_{12} + \dots + b_m P_{1m} &= A_{11} \\
b_1 P_{21} + b_2 P_{22} + \dots + b_m P_{2m} &= A_{21} \\
&\;\;\vdots \\
b_1 P_{m1} + b_2 P_{m2} + \dots + b_m P_{mm} &= A_{m1}
\end{aligned}
\qquad [19.11]
$$
The notation here is abbreviated as in Chapter 13. For example, $P_{11}$ is the
phenotypic variance of character 1, and $P_{12}$ is the phenotypic covariance of
characters 1 and 2; $A_{11}$ and $A_{12}$ are similarly the additive genetic variance and
covariance. The variances and covariances can be expressed in terms of the
heritabilities and correlations as follows, where the subscripts i and j refer to any
two different characters and $\sigma^2$ is the phenotypic variance:

$$
\begin{aligned}
P_{ii} &= \sigma_i^2; & A_{ii} &= h_i^2 \sigma_i^2 \\
P_{ij} &= r_P \sigma_i \sigma_j; & A_{ij} &= r_A h_i h_j \sigma_i \sigma_j
\end{aligned}
\qquad [19.12]
$$

When the values of the variances and covariances have been entered, the solution
of equations [19.11] provides the values of the weighting factors, b, to be used in
the index in equation [19.10]. The construction of an index is illustrated later, in
Example 19.5. The expected response to selection will be dealt with after the different
forms of the index equations have been explained.
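
Equations [19.12] translate directly into code. The sketch below, in which the function name and the argument layout are illustrative assumptions, builds the phenotypic and additive genetic matrices from the parameters tabulated in Example 19.5. Because it works from unrounded square roots, the covariances differ slightly from the values given in that example (the additive covariance comes out near 0.154 rather than 0.1557).

```python
import math

def covariance_matrices(h2, var_p, r_a, r_p):
    """Build the phenotypic (P) and additive genetic (A) matrices by [19.12].

    h2 and var_p are lists of heritabilities and phenotypic variances;
    r_a[i][j] and r_p[i][j] are the genetic and phenotypic correlations
    (only the off-diagonal entries are read).
    """
    m = len(h2)
    sd = [math.sqrt(v) for v in var_p]   # phenotypic standard deviations
    h = [math.sqrt(x) for x in h2]       # square roots of the heritabilities
    P = [[0.0] * m for _ in range(m)]
    A = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if i == j:
                P[i][j] = var_p[i]
                A[i][j] = h2[i] * var_p[i]
            else:
                P[i][j] = r_p[i][j] * sd[i] * sd[j]
                A[i][j] = r_a[i][j] * h[i] * h[j] * sd[i] * sd[j]
    return P, A

# Parameters of Example 19.5: character 1 is weight, character 2 tail length.
P, A = covariance_matrices(
    h2=[0.36, 0.44],
    var_p=[6.37, 0.28],
    r_a=[[1.0, 0.29], [0.29, 1.0]],
    r_p=[[1.0, 0.45], [0.45, 1.0]],
)

print("P =", [[round(x, 4) for x in row] for row in P])
print("A =", [[round(x, 4) for x in row] for row in A])
```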


Economic value. Next consider the improvement of economic value. The economic 
value is the profit made from the sale of the individual. In practical breeding operations
it is often possible to assign economic values to individuals. This is then the
phenotypic value of merit, which is the character to be improved, and the index
is constructed for the improvement of this single character. But the index equations,
whose solution gives the value of the b's in the index, differ in one respect from
what was described above. The economic values of individuals cannot be known 
at the time they are being considered for selection, and therefore cannot be included 
as a character in the index. The index equations are then as follows. In order to 
facilitate comparison, character 1 is still taken to be the character to be improved, 
in this case merit. 


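$$
\begin{aligned}
b_2 P_{22} + b_3 P_{23} + \dots + b_m P_{2m} &= A_{21} \\
b_2 P_{32} + b_3 P_{33} + \dots + b_m P_{3m} &= A_{31} \\
&\;\;\vdots \\
b_2 P_{m2} + b_3 P_{m3} + \dots + b_m P_{mm} &= A_{m1}
\end{aligned}
\qquad [19.13]
$$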

The variances and covariances have to be estimated from past records of the economic 
values and the values of the characters in the index. 


Multiple traits. Finally, consider simultaneous selection for several characters. The
objective is to improve the aggregate breeding value, or net merit, which is a particular
combination of all the characters to be improved. Merit is now defined as

$$ H = a_1 A_1 + a_2 A_2 + \dots + a_n A_n \qquad [19.14] $$

Here the A's are breeding values for the n characters to be improved, and the a's
are weighting factors which express the relative importance attached by the breeder
to each character. The weighting factors can be economic values; that is to say, each
a is the value in money units of 1 unit of the character. This is how an index is
constructed if the aim is to improve economic value when there are no records of
individuals’ economic values, so that the index described above cannot be used.
Assigning money values to the characters is, however, not necessarily the best method
of improvement. Other criteria for weighting are discussed by Fowler, Bichard, and 
Pease (1976) in connection with the improvement of pigs. If the weighting factors 
are not in money units, they must express in some other way the relative importance 
to the breeder of 1 unit increase of each character. Yamada, Yokouchi, and Nishida 
(1975) describe indices constructed in this way. 

The number of characters in the definition of merit (equation [19.14]) and in the 
index (equation [19.10]) may differ: there may be characters that are not in H but 
which may help to improve H through their correlations if included in I; and, conversely,
there may be characters in H which cannot be measured and so are not
in I. It is important to note, however, that if the aim is to improve economic value,
then all the characters that influence economic value must be included in the definition
of H.

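The index equations, whose solution gives the b's to be used in the index, are
obtained in the same way as was described in Chapter 13, by maximizing $r_{HI}$, the
correlation between merit and the index. They are as follows:

$$
\begin{aligned}
b_1 P_{11} + b_2 P_{12} + \dots + b_m P_{1m} &= a_1 A_{11} + a_2 A_{12} + \dots + a_n A_{1n} \\
b_1 P_{21} + b_2 P_{22} + \dots + b_m P_{2m} &= a_1 A_{21} + a_2 A_{22} + \dots + a_n A_{2n} \\
&\;\;\vdots \\
b_1 P_{m1} + b_2 P_{m2} + \dots + b_m P_{mm} &= a_1 A_{m1} + a_2 A_{m2} + \dots + a_n A_{mn}
\end{aligned}
\qquad [19.15]
$$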

The variances and covariances can again be expressed in terms of the heritabilities 
and correlations by equations [19.12]. Example 19.6 below illustrates the construction
of an index from these equations, simplified by considering only two characters.


Response 

The index equations ([19.11], [19.13], [19.15]) are scaled in such a way that the
regression of merit on index values is unity, i.e., $b_{HI} = 1$. The values of the b's
in the index (equation [19.10]) are thus adjusted so that the metric values of the
index I correspond numerically with the units in which merit H is expressed, whatever
they are, when both are deviations from the mean. In this way the index becomes
a prediction of breeding value for merit. With $b_{HI} = 1$, it follows that $r_{HI} = \sigma_I/\sigma_H$,
as shown in Chapter 13. The expected response to selection is the same as in Chapter
13, the predicted change in merit being given by

$$ R_H = i r_{IH} \sigma_H \qquad [19.16] $$

or, if the index has not been rescaled,

$$ R_H = i \sigma_I \qquad [19.17] $$

The variance of the index, from which $\sigma_I$ can be evaluated, is as in equation [13.12]
when there is only one character in merit, i.e., for single-trait selection. Extended
to include multiple-trait selection it is as follows. To simplify the notation, let $\Sigma_1 aA$
be the sum of the terms on the right-hand side of the first equation of [19.15], $\Sigma_2 aA$
that of the second equation, etc. Then the variance is

$$ \sigma_I^2 = b_1 \Sigma_1 aA + b_2 \Sigma_2 aA + \dots + b_m \Sigma_m aA \qquad [19.18] $$

The standard deviation of the index, $\sigma_I$, provides a simple way of comparing the
relative efficiencies of different indices for improving merit because, as can be seen
from equation [19.17], the response of merit is simply proportional to $\sigma_I$.

The response of any one of the component characters of the index, or of merit, 
can be predicted as follows. Suppose we want to predict the response of character 
1. This is a correlated response of character 1 to selection for the index, and it is 
predicted by adaptation of equation [19.5a], putting character 1 in place of Y and
the index I in place of X. The response of the index is the same as that of merit
given in equation [19.17]. The correlated response of character 1 then becomes

$$ CR_1 = b_{(A)1I} \, i \sigma_I = \frac{cov_{(A)1I}}{\sigma_I^2} \, i \sigma_I = \frac{i}{\sigma_I} \, cov_{(A)1I} \qquad [19.19] $$

Here $cov_{(A)1I}$ is the additive genetic covariance of character 1 with the index, and
it is obtained as follows. Multiplying equation [19.10] by $A_1$ gives the sum of products
of $A_1$ with I as $b_1 P_1 A_1 + b_2 P_2 A_1 + \dots$. If each P is now written as (A+E),
the products AE drop out because breeding values and environments are uncorrelated.
The required covariance can now be seen to be as follows, where the variances and
covariances are written in the notation of the index equations:

$$ cov_{(A)1I} = b_1 A_{11} + b_2 A_{12} + \dots + b_m A_{1m} \qquad [19.20] $$

These variances and covariances must be known for construction of the index, so 
substitution into equation [19.19] gives the predicted response. 

The following two examples will make clearer what is involved in constructing 
an index and will bring in one or two points of interest that have not been explained 
above. 

Example 19.5 The use of an index for the improvement of a single character will be 
illustrated from an experiment to be described in Example 19.6. Suppose the character 
to be improved is body weight in mice, and we consider using the correlated character 
tail length in an index. Let body weight be character 1 and tail length character 2. The 
parameters needed to construct the index are given in the table, with values taken from 
Rutledge, Eisen, and Legates (1973). 


Character                 h^2      h       σ_P^2     σ_P      r_A      r_P

1 = Weight (g)            0.36     0.60    6.37      2.52
                                                              0.29     0.45
2 = Tail length (cm)      0.44     0.67    0.28      0.53

P_11 = 6.37          P_22 = 0.28          P_12 = P_21 = 0.6010
A_11 = 2.2932        A_22 = 0.1232        A_12 = A_21 = 0.1557

The index equations for solution, from equations [19.11], are

$$
\begin{aligned}
b_1 P_{11} + b_2 P_{12} &= A_{11} \\
b_1 P_{21} + b_2 P_{22} &= A_{21}
\end{aligned}
$$

The values of the variances and covariances, calculated by equations [19.12], are given
in the table. Substituting these in the above equations gives

$$
\begin{aligned}
6.37 b_1 + 0.6010 b_2 &= 2.2932 \\
0.6010 b_1 + 0.28 b_2 &= 0.1557
\end{aligned}
$$

and the solution is

$$ b_1 = 0.386; \qquad b_2 = -0.272 $$

The index for selection is therefore

$$ I = 0.386W - 0.272T $$

where W and T are the weight (g) and tail length (cm) respectively. The index can be
rescaled for convenience by dividing all through by $b_1$, to give

$$ I' = W - 0.705T $$

The index values are altered by the rescaling, but not the order of merit of the individuals.

Note that tail length is given a negative weighting in an index for increasing body 
weight. In other words, tail length is an indicator of environment, rather than breeding 
value, for weight. The reason for this is that the environmental correlation, which is 
0.56, is much higher than the genetic correlation of 0.29. This illustrates a point not 
made previously, that a character may be useful in an index as an indicator of environmental
deviations rather than of breeding values.

The usefulness of a secondary character can be judged from its weighting coefficient
$b_2$. It can be shown that $b_2 = 0$ if the genetic and phenotypic regressions of character
2 on character 1 are the same; or, in terms of correlations, if $r_A/r_P = h_1/h_2$. Under these
conditions a secondary character will give no benefit; in fact, errors of estimation will 
make it worse than useless (Sales and Hill, 1976). 

To predict the response to selection, we have to calculate the variance of the index.
For this purpose the unscaled index must be used. The variance, by equation [19.18], is

$$ \sigma_I^2 = (0.386 \times 2.2932) + (-0.272 \times 0.1557) = 0.8428 $$

and $\sigma_I = 0.918$. The response could then be predicted by equation [19.17] if the intensity
of selection were known. In order to see how useful the secondary character would
be, we can compare the expected responses to index selection and to simple selection,
assuming the intensity of selection is the same. The response to simple selection, by
equation [11.3], is $i h^2 \sigma_P$. The ratio of the responses, index selection/simple selection,
is therefore 0.918/0.907 = 1.012. The index would be only 1 per cent better than selection
for body weight alone.

Finally, what would be the expected change of tail length resulting from selection for
the index? The correlated response of character 1 is given by equations [19.19] and [19.20],
but in this case we want the response of character 2. First, by equation [19.20],

$$ cov_{(A)2I} = b_2 A_{22} + b_1 A_{21} = -0.0335 + 0.0601 = +0.0266 $$

Then, by equation [19.19], the expected response of tail length is

$$ CR_2 = +i(0.0266/0.9185) = +0.03i \text{ cm per generation.} $$

A very small increase of tail length is expected. It might seem at first that the negative 
weighting of tail length in the index should result in a decrease. But the correlated response 
depends on the genetic correlation, which is positive. 
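
The arithmetic of this example can be followed step by step in the sketch below. It starts from the variances and covariances as tabulated, solves the two index equations by Cramer's rule, and then evaluates the variance of the index, the efficiency relative to simple selection, and the correlated response of tail length per unit of selection intensity. The helper solve_2x2 and all variable names are assumptions introduced for illustration.

```python
import math

def solve_2x2(m, rhs):
    """Solve the 2 x 2 linear system m * b = rhs by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    b1 = (rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det
    b2 = (m[0][0] * rhs[1] - m[1][0] * rhs[0]) / det
    return b1, b2

# Variances and covariances as tabulated (1 = weight, 2 = tail length).
P = [[6.37, 0.6010], [0.6010, 0.28]]
A = [[2.2932, 0.1557], [0.1557, 0.1232]]

# Index equations [19.11]: right-hand side is the column of A for character 1.
b1, b2 = solve_2x2(P, [A[0][0], A[1][0]])

# Variance of the index, equation [19.18], with weight as the whole of merit.
var_index = b1 * A[0][0] + b2 * A[1][0]
sd_index = math.sqrt(var_index)

# Response to simple selection on weight per unit intensity: h^2 * sigma_P.
simple_response = 0.36 * 2.52
relative_efficiency = sd_index / simple_response

# Correlated response of tail length per unit intensity, [19.19] and [19.20].
cov_tail_index = b1 * A[1][0] + b2 * A[1][1]
cr_tail_per_i = cov_tail_index / sd_index

print(f"b1 = {b1:.3f}, b2 = {b2:.3f}, rescaled weight on T = {b2 / b1:.3f}")
print(f"sigma_I = {sd_index:.3f}, index/simple = {relative_efficiency:.3f}")
print(f"CR of tail length = {cr_tail_per_i:+.3f} i cm per generation")
```

Carrying the unrounded weights through gives figures that agree with those of the example to within rounding.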

Example 19.6 The experiment with mice, from which the data for Example 19.5 were 
taken, applied index selection with the object of changing both body weight and tail length, 
and compared the observed with the expected responses (Rutledge, Eisen, and Legates, 
1973). The objective was to change the body conformation by increasing one character
and decreasing the other. Four lines were selected for seven generations, two selected 
for increased body weight and decreased tail length, and two selected in the opposite 
direction. The construction of the index for increasing body weight and decreasing tail 
length will be explained. The parameters needed are given in the table of Example 19.5. 
In addition we need the ‘economic’ weighting of the two characters. Equal ‘economic’ 
values were assigned to one standard deviation of change of each character. The weights, 
a, assigned were therefore the reciprocals of the phenotypic standard deviations, and 
these were $a_1 = 0.40$ and $a_2 = -1.89$.

The index equations, from equation [19.15], are

$$
\begin{aligned}
b_1 P_{11} + b_2 P_{12} &= a_1 A_{11} + a_2 A_{12} \\
b_1 P_{21} + b_2 P_{22} &= a_1 A_{21} + a_2 A_{22}
\end{aligned}
$$

Substituting the variances and covariances, and the weights, leads to

$$
\begin{aligned}
6.37 b_1 + 0.601 b_2 &= 0.9173 - 0.2943 = 0.6230 \\
0.601 b_1 + 0.28 b_2 &= 0.0623 - 0.2328 = -0.1706
\end{aligned}
$$

and the solution is

$$ b_1 = 0.195; \qquad b_2 = -1.027 $$

The index for selection is thus

$$ I = 0.195W - 1.027T $$

where W and T are an individual’s body weight and tail length respectively. For selection
in the opposite direction, to decrease W and increase T, the signs in the index are
simply reversed.

The variance of the index, by equation [19.18], is

$$ \sigma_I^2 = (0.195 \times 0.6230) + (-1.027 \times -0.1706) = 0.2967; \qquad \sigma_I = 0.5447 $$

The intensity of selection realized, averaged over the four lines, was i = 1.01. The
expected response based on the selection actually applied was, by equation [19.17],

$$ R_H = 1.01 \times 0.5447 = 0.55 \text{ index units per generation.} $$

The observed responses were 0.26 and 0.30 in the lines selected for increased W with 
decreased T, and 0.42 and 0.45 in the lines selected in the opposite direction. The reason 
for the observed responses being somewhat less than expected is probably that the 
parameters for construction of the index were not accurately estimated. 

(Some of the quantities calculated in this example differ a little from those given in 
the original paper. The reason for this is that the parameters used in the paper were derived 
from the base population before selection, but the published parameters used here are 
those of unselected lines maintained concurrently with the selection.) 
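
The construction of the two-character index in this example can be retraced with the following sketch, which forms the right-hand sides of equations [19.15] from the weights a, solves for the b's, and predicts the response by equation [19.17]. The helper solve_2x2 and the variable names are illustrative assumptions; the parameters are those quoted in the example.

```python
import math

def solve_2x2(m, rhs):
    """Solve the 2 x 2 linear system m * b = rhs by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    b1 = (rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det
    b2 = (m[0][0] * rhs[1] - m[1][0] * rhs[0]) / det
    return b1, b2

# Variances and covariances (1 = body weight, 2 = tail length).
P = [[6.37, 0.601], [0.601, 0.28]]
A = [[2.2932, 0.1557], [0.1557, 0.1232]]

# 'Economic' weights: one phenotypic standard deviation of each character is
# valued equally, with tail length to be decreased.
a = [0.40, -1.89]

# Right-hand sides of [19.15]: sum over j of a_j * A_kj for each equation k.
rhs = [sum(a[j] * A[k][j] for j in range(2)) for k in range(2)]
b1, b2 = solve_2x2(P, rhs)

# Variance of the index by [19.18] and expected response by [19.17].
var_index = b1 * rhs[0] + b2 * rhs[1]
sd_index = math.sqrt(var_index)
realized_intensity = 1.01
expected_response = realized_intensity * sd_index

print(f"b1 = {b1:.3f}, b2 = {b2:.3f}")
print(f"sigma_I = {sd_index:.4f}, R_H = {expected_response:.2f} index units per generation")
```

Reversing the signs of both weights in a reverses the signs of both b's and leaves the variance of the index unchanged, which is why the same expected response applies to the lines selected in the opposite direction.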


Effect of selection on genetic correlations 

There is one important consequence of multiple-trait selection to be discussed before 
we leave the subject. Just as the heritabilities are expected to change after selection 
has been applied for some time, so also are the genetic correlations. If selection 
has been applied to two characters simultaneously, the genetic correlation between 
them is expected eventually to become negative, for the following reason. Those
pleiotropic genes that affect both characters in the desired direction will be strongly 
acted on by selection and brought rapidly toward fixation. They will then contribute 
little to the variances or to the covariance of the two characters. The pleiotropic 
genes that affect one character favourably and the other adversely will, however, 
be much less strongly influenced by selection and will remain for longer at 
intermediate frequencies. Most of the remaining covariance of the two characters 
will therefore be due to these genes, and the resulting genetic correlation will be 
negative. The consequence of a negative genetic correlation, whether produced by 
selection in this way or present from the beginning, is that the two characters may 
each show a heritability that is far from zero, and yet when selection is applied to 
them simultaneously neither responds. We have already discussed, in Chapter 12, 
what is essentially the same situation resulting from the combined effects of artificial 
and natural selection: a selection limit is reached even though the character to which 
artificial selection is applied still shows a substantial amount of additive genetic 
variance. 

The theoretical expectation that selection should change the genetic correlation 
has been tested in several experiments but, as so often with genetic correlations, 
the evidence is conflicting. For a review and discussion of this question see Sheridan 
and Barker (1974). 

Problems 

19.1 The data below are taken from a sib analysis in a flock of broiler chickens. 
They refer to the weight gain (G) from 5 to 9 weeks of individual males and the
weight of food (F) consumed in the same period, both in units of grams. The figures
given are the components of variance and of covariance between half-sib families, 
and the total phenotypic variances and covariance. From these data calculate the 
heritability of the two characters, and the phenotypic, genetic, and environmental 
correlations between them. 



                          Variance                    Covariance

                 Weight gain   Food consumption        G with F

Between sires        1,602           6,150               2,229
Total               12,321          61,504              22,848


Data based on Pym, R.A.E. & Nicholls, P.J. (1979) Brit. Poult. Sci., 20, 73-86.

[Solution 40] 

19.2 The data refer to two characters of Drosophila: body size measured as thorax 
length, and fertility measured as the number of eggs laid in 4 days. The phenotypic 
variances and the covariance were measured in a genetically variable population and 
in a genetically uniform group consisting of F1s of crosses between inbred lines.
From these data, estimate the three correlations (phenotypic, genetic, and
environmental) in the variable population. How does the meaning of the genetic
correlation here differ from that of the genetic correlation in Problem 19.1?


Population          Variances           Covariance

              Body size    Fertility

Variable        0.366        43.4          0.87
Uniform         0.186        16.6          0.27


Data from Robertson, F.W. (1957) J. Genet., 55, 428-43.


[Solution 50] 

19.3 Consider again the data on broiler chickens in Problem 19.1. Suppose that 
selection for increased weight gain is to be applied in one line and for increased 
food consumption in another line. The proportions selected in both lines are to be 
10 per cent of males and 20 per cent of females. Calculate the predicted responses 
per generation of the two characters when directly selected and when responding 
as correlated characters. 


[Solution 60] 

19.4 Five generations of selection were applied to the broiler flock described in 
Problem 19.1. One line was selected for increased weight gain (G) and another was 
selected for increased food consumption (F). The total selection differentials applied 
over the five generations and the total responses were as follows, the responses being 
deviations from a control line. 



                               Line selected for

                                  G          F

Selection differential (g)       574       1,312
Response of G (g)                186         120
Response of F (g)                412         525

Calculate the realized heritabilities of the two characters and the realized genetic 
correlation between them. 

Data from Pym, R.A.E. & Nicholls, P.J. (1979) Brit. Poult. Sci., 20, 73-86.

[Solution 70] 

19.5 The litter size of mice could be increased by selection of females for their 
litter size, or by selection of both sexes for body weight. Which would be the better 
of these two simple procedures, given the following parameters? 


Heritability of litter size  = 0.22
Heritability of body weight  = 0.35
Genetic correlation          = 0.43
Proportion selected: females = 25%
                     males   = 10%


[Solution 80]

19.6 In addition to the two procedures for increasing litter size considered in
Problem 19.5, a third procedure would be to select females for litter size and males
for body weight. There are several ways in which this could be carried out. Assume 
the procedure to be as follows. Males are weighed at the appropriate age (e.g. six 
weeks) and 25 per cent are selected out of a large number. The selected males are 
each mated to a randomly chosen group of four young females whose litter sizes 
are not yet known. When the females have had their first litters the best one of the 
four mated to each male is selected. The litters of these selected males and females 
are reared as the next generation. How much better would this procedure be than 
selecting only on females for litter size? Take the heritabilities and genetic
correlation to be as given in Problem 19.5.

[Solution 90] 

19.7 The data refer to the broiler chickens described in Problem 19.1. Suppose
that it is desired to improve 5-9 week growth (G), using the weight of food
consumed (F) as an aid to the selection. Calculate the appropriate index for evaluating
the parents to be selected. The estimates of the parameters needed are as follows.
In order to have the decimal points conveniently placed for the calculations, the units 
of weight are changed here from grams to 100 g units. 


            G        F

h²         0.52     0.40
σ_P        1.11     2.48
r_A            0.71
r_P            0.83


[Solution 100]

19.8 If selection for growth were applied by the index calculated in Problem 19.7,
what would be the predicted improvement per generation, assuming the intensity
of selection was i = 1.5775 as in Problem 19.3? How much better would the index
be for improving growth than selection for growth alone without the aid of a
secondary character?

[Solution 110]

19.9 If selection for growth were applied by the index as in Problem 19.8, what
would be the expected rate of change of the secondary character food consumption?

[Solution 120]


19.10 From the data and calculations in Problem 19.7 calculate an index for 
improving economic value, given that the value of growth is 8 cents per 100 g of 
weight gain, and that of food consumption is -2 cents per 100 g. 

Data from Pym, R.A.E. & James, J.W. (1979) Brit. Poult. Sci., 20, 99-107.

[Solution 130]

19.11 Calculate the rate of improvement of economic value expected from selection
by the index in Problem 19.10, when 10 per cent of males and 20 per cent of
females are selected as in Problem 19.3. How much better would the index be for 
improving economic value than selection for growth alone? 

[Solution 140] 



20 METRIC CHARACTERS UNDER NATURAL
SELECTION


Throughout the discussion of the genetic properties of metric characters, which has 
occupied the major part of the book, very little attention has been given to the effects 
of natural selection, and something must now be done to remedy this omission. The 
absence of differential viability and fertility was specified as a condition in the 
theoretical development of the subject: that is to say, natural selection was assumed 
to be absent. Though for many purposes this assumption may lead to no serious 
error, a complete understanding of metric characters will not be reached until the 
effects of natural selection can be brought into the picture. The operation of natural 
selection on metric characters has, however, a much wider interest than just as a 
complication that may disturb the simple theoretical picture and the predictions based 
on it. It is to natural selection that we must look for an explanation of the genetic 
properties of metric characters which hitherto we have accepted with little comment. 
The genetic properties of a population are the product of natural selection in the 
past, together with mutation and random drift. It is by these processes that we must 
account for the existence of genetic variability; and it is chiefly by natural selection 
that we must account for the fact that characters differ in their genetic properties, 
some having proportionately more additive variance than others, some showing
inbreeding depression while others do not. These, however, are very wide problems
which are still far from solution, and in this concluding chapter we can do little more 
than indicate their nature. Before considering the ways in which natural selection 
affects metric characters, we shall give a brief account of natural selection itself 
and what it means. 

Natural selection 

Fitness and its components 

The ‘character’ that natural selection selects for is fitness. The fitness of an individual 
is the contribution of genes that it makes to the next generation, or the number of 
its progeny represented in the next generation. Relative fitness is the fitness of an 
individual relative to the population mean, i.e., $W / \bar{W}$, if W is the individual’s fitness.
If a population is neither expanding nor contracting in numbers, the mean fitness 
of its individuals is 1 and then absolute fitness and relative fitness are the same. 
There are difficulties in defining fitness precisely. One such difficulty lies in separating 
the fitness of an individual from that of its parents. In a mammal, for example, the 
survival of the juvenile progeny depends partly on their viability, which is an aspect
of their own fitness, and partly on the parental care that they received, which is 
an aspect of the parents’ fitness. This overlap of fitness from one generation to the 
next means that there is no precise time in an individual’s life at which we can say 
that its attributes reflect its own fitness rather than that of its parents. The point of 
separation between the generations must therefore be a more or less arbitrary choice. 
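
The definition of relative fitness given at the start of this section can be illustrated with a minimal sketch. The progeny counts are hypothetical and the function name is chosen only for the illustration; fitness is taken simply as the number of progeny represented in the next generation.

```python
def relative_fitness(progeny_counts):
    """Return each individual's fitness divided by the population mean fitness."""
    mean_w = sum(progeny_counts) / len(progeny_counts)
    return [w / mean_w for w in progeny_counts]


# Hypothetical numbers of progeny contributed to the next generation.
stable = [0, 1, 1, 2]      # mean fitness 1: population neither expanding nor contracting
expanding = [0, 2, 2, 4]   # mean fitness 2: population doubling each generation

print(relative_fitness(stable))     # equal to the absolute fitnesses
print(relative_fitness(expanding))  # the same relative values as in the stable case
```

When the mean fitness is 1 the absolute and relative values coincide, as stated in the text; in the expanding population the relative values are unchanged although every absolute fitness is doubled.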

The mean fitness of a population is a concept that has to be used with great care. 
It was said above that the individuals of a population have a mean fitness of 1 if 
the population is neither increasing nor decreasing in numbers. This seems simple 
enough. But whether a population increases or decreases or remains constant in 
numbers depends to a large extent on the environmental resources available to it. 
Natural selection between individuals within a population may change the genetic 
constitution of the population, but the mean fitness will not change if the population 
is already at the limit of the carrying capacity of its environment. When the mean 
fitness is referred to in what follows, it will be assumed that the population is not 
limited by environmental resources. 

The fitness of an individual is the final outcome of all its developmental and 
physiological processes. The differences between individuals in these processes are 
seen in variation of the measurable attributes which can be studied as metric 
characters. Thus the variation of each metric character reflects to a greater or lesser 
degree the variation of fitness; and the variation of fitness can theoretically be broken 
down into variation of metric characters. Consider, for example, a mammal such 
as the mouse. Figure 20.1 illustrates the hierarchy of characters contributing to the 
fitness of females. Nearly all of the characters shown have been studied genetically 
as metric characters. Fitness itself can be broken down into two major components, 
the total number of offspring produced and the quality of these offspring, which
might be measured as their weaning weights. The variation of the major components,
if properly measured, would account for all the variation in fitness. The variation
of the major components can in turn be attributed to other characters, some of which
are shown in column 3 of the diagram. These again are influenced by others, a few
of which are in column 4. The characters in column 4 are themselves influenced
by many others. Among these, for example, are physiological functions such as the
output of the various gonadotrophic hormones which influence ovulation rate, embryo
survival, and milk-yield. There are, in addition, characters whose influences on fitness
are less direct and less obvious, but which are correlated with some of the
components of fitness. Body size, for example, is correlated with several, perhaps all,
of the characters in column 4. The problem we have to examine is how all these
metric characters are affected by selection for fitness, and how the action of natural
selection is related to the character’s position in the hierarchy.

[Figure 20.1: diagram arranged in four numbered columns (1 to 4). Labels shown:
Fitness; Total number of offspring born (Fertility); Quality of offspring weaned
(Maternal performance); Viability; Mating success; Litter size; Frequency of litters;
Number of litters; Milk-yield; Maternal behaviour; Disease resistance; Predator
avoidance; Ovulation rate; Embryo survival; Mammary-gland size.]

Fig. 20.1. Some of the components of fitness of a mammal such as the mouse, to show the
hierarchy of causes of variation. Variation of each of these metric characters is associated, to
a greater or lesser degree, with variation of fitness.

Measurement of fitness. It is very difficult to measure fitness directly, particularly 
the fitness of individuals. It is less difficult to measure the major components of 
fitness separately. The overall fitness can then be estimated by combining the values 
of the components. A measure related to the mean fitness of a strain can be obtained 
by rearing it in competition with a tester strain which is genetically marked so that 
the offspring produced by crossing can be identified. The relative numbers of progeny 
produced by the two strains then provide a ‘competitive index’ of the strain under 
test. For details of the estimation of fitness in Drosophila, and references, see Mackay 
(1985). In industrialized societies of man, by far the most important component of 
fitness is the number of children reared, since mortality between childhood and the 
end of reproduction is very low. The size of completed families therefore provides 
a fairly good measure of the fitness of the two parents jointly though, of course, 
it includes infant and childhood survival in the parents’ fitness. An improved measure 
can be obtained by taking account of the rate of reproduction, i.e., the average age 
of the parents at the birth of their children (see Waller, 1971). 

Relationships between metric characters and fitness 

We are now in a position to discuss the ways in which natural selection affects 
characters of different sorts, as described above. We have to consider what sort of 
selection is being applied to the character, what this will do to the frequencies of 
the genes concerned, and how the genetic properties of the character are thereby 
influenced. We shall be concerned mainly with populations that are approximately 
in equilibrium, and so the properties of equilibrium populations will be described first. 

Equilibrium populations 

A population in equilibrium is one in which gene frequencies are not changing at 
any loci. Consequently the mean values of all metric characters remain constant and, 
despite continued natural selection, fitness does not increase. Or, to put it in another 
way, the population has reached a selection limit for fitness. Probably few real
populations are strictly in equilibrium at selection limits because the attainment of the
limit would require the environment to have remained unchanged over a long time.
Nevertheless, most populations studied are probably near enough to equilibrium that we
can infer from them what the genetic properties of an equilibrium population are. 




Abundant evidence proves that virtually all metric characters are genetically 
variable in populations that are more or less in equilibrium, including characters 
that affect fitness. There must therefore be genetic variance of fitness. But, since 
selection for fitness produces no response, there can be no additive genetic variance 
of fitness; so all the genetic variance of fitness must be non-additive, i.e., variance 
due to dominance and epistatic interactions. The array of gene frequencies in an 
equilibrium population is the best, in the circumstances, for maximizing fitness. 
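
How a population can have genetic variance of fitness with no additive component can be seen in a single-locus sketch. The example below assumes one locus with two alleles, random mating, and overdominance for fitness (one of several possible causes of such variance); the selection coefficients `s1` and `s2` are hypothetical. It uses the standard single-locus expressions for the average effect and for the additive and dominance variances.

```python
def variance_components(w11, w12, w22, p):
    """Additive and dominance variance of fitness at one locus with two alleles.

    p is the frequency of allele A1; w11, w12, w22 are the genotype fitnesses.
    Random mating is assumed.
    """
    q = 1.0 - p
    a = (w11 - w22) / 2.0            # half the difference between the homozygotes
    d = w12 - (w11 + w22) / 2.0      # deviation of the heterozygote from the mid-point
    alpha = a + d * (q - p)          # average effect of the gene substitution
    v_a = 2.0 * p * q * alpha ** 2   # additive variance
    v_d = (2.0 * p * q * d) ** 2     # dominance variance
    return v_a, v_d


s1, s2 = 0.2, 0.3                    # hypothetical selection against the two homozygotes
p_eq = s2 / (s1 + s2)                # equilibrium frequency of A1 under overdominance
v_a, v_d = variance_components(1 - s1, 1.0, 1 - s2, p_eq)
print(v_a, v_d)
```

At the equilibrium gene frequency the average effect is zero, so the additive variance vanishes while the dominance variance does not. Away from equilibrium (for example `p = 0.3`) the same function returns a positive additive variance, which is the variance on which selection acts to restore the equilibrium.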

Now, if selection is applied to any metric character that is not fitness itself, the 
gene frequencies at loci affecting the character must change if there is a response. 
Fitness must therefore be reduced as a correlated response, unless the character 
selected is controlled entirely by genes with no effects on fitness. This expectation 
is amply borne out by experience: experimental selection for metric characters almost
always results in a reduction of one or more of the major components of fitness.
To give just one example: the mean fitness of Drosophila was estimated as a
competitive index after five generations of selection for abdominal bristle number (Latter
and Robertson, 1962). There were two lines selected upwards and two downwards. 
The mean fitness, relative to an unselected control, was 79 per cent in the upward 
selected lines and 65 per cent in the downward selected lines. 

If artificial selection is carried out and is then suspended before much of the
variation has been lost by fixation, natural selection must tend to bring the gene
frequencies back toward their equilibrium values, and the mean of the character
artificially selected is expected to revert toward its original value. This tendency for
natural selection to resist changes of gene frequency is known as genetic homeostasis
(Lerner, 1954). Its effect can often be seen in experimental selection when the weighted
selection differential is less than the unweighted (see Example 11.5).

If the environment to which an equilibrium population is adapted changes, the
array of gene frequencies is no longer optimal. The changed environmental
circumstances alter the relative weighting of the components of fitness, so that fitness
now has some additive variance and can respond to natural selection. The
application of artificial selection can be thought of as changed environmental circumstances
in this way, altering the weighting of the components in the combined natural and 
artificial fitness. The components of fitness in human populations have changed 
drastically in the recent past as a result of medicine and contraception, psychological 
and behavioural factors having largely replaced physiological factors as determinants 
of fitness. 

‘Fitness profiles’ 

The question about how a metric character is related to fitness can be put more
precisely as follows: if two individuals differ in their genotypic values of the character,
how do they differ in fitness? Suppose we could measure the fitness and the genotypic
value of each individual. If we now plot fitness against the metric character we should
get what may be called a ‘fitness profile’. Figure 20.2, which is based on the ideas
of A. Robertson (1955), illustrates three such profiles, representing characters
having different relationships with fitness. These will first be described briefly and then
considered in turn more fully.

Fig. 20.2. ‘Fitness profiles’ as explained in the text. The scales on both axes are standard
deviations from the means.

The broken diagonal line is the profile that would be obtained if the metric character
measured was fitness itself. Curve (1) is the profile of a major component of fitness,
such as the total number of offspring born. As this increases from its lowest values,
fitness increases almost linearly and at a rate nearly equal to that of fitness itself. But at
the upper end of the range there is a point above which a further increase of the
character leads to a reduction of fitness. This bending down of the profile at the
upper end results from interactions with other components. To take number born
as an example: at the low end of the range each additional young born results in
nearly one additional offspring reaching adulthood. But as the number born increases,
their ‘quality’ is progressively reduced by limitations of maternal performance until,
above a certain number, the reduced quality outweighs the extra numbers, and fewer
offspring survive to adulthood. In profile (2) the fittest individuals are those with
values of the character at or near the mean. Body size of many organisms is
probably of this sort. The reasons for characters having an intermediate optimum will
be discussed later. Many characters must be expected to have profiles falling
between curves (1) and (2), with optima at some distance from the mean. Finally,
profile (3) represents a character that is neutral, or nearly so, with respect to fitness.
There are almost no differences of fitness among individuals with different values 
of the character. The number of abdominal bristles in Drosophila is a character of 
this sort. The evidence for characters being neutral will be given later. 
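
The bending down of curve (1) can be imitated with a toy calculation. It assumes, purely for illustration, that the survival of each young to adulthood falls linearly with the number born; the function name and the parameter `quality_cost` are hypothetical and are not estimates from any data.

```python
def offspring_reaching_adulthood(number_born, quality_cost=0.05):
    """Toy fitness profile: survival per young falls linearly with number born."""
    survival = max(0.0, 1.0 - quality_cost * number_born)
    return number_born * survival


# Fitness (offspring reaching adulthood) for each number born from 0 to 20.
profile = {n: offspring_reaching_adulthood(n) for n in range(0, 21)}
best = max(profile, key=profile.get)

for n in (1, 2, 5, 10, 15):
    print(n, round(profile[n], 2))
print("number born giving most adult offspring:", best)
```

At the low end each additional young adds nearly one adult offspring; beyond the peak the reduced survival outweighs the extra numbers, which is the shape described for curve (1).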

In earlier chapters characters have been referred to as being ‘closely connected’ 
with fitness, but the precise meaning of the ‘close connection’ was not explained.
It can now be seen that characters closely connected to fitness are the major
components, with fitness profiles like curve (1). The closeness of the connection falls
off as the profile approaches that of curve (2). The generalization was made in earlier 
chapters that characters closely connected with fitness tend to have low heritabilities 
and to be severely affected by inbreeding depression. (The extensive data surveyed
by Roff and Mousseau (1987) and Mousseau and Roff (1987) confirm the
generalization about heritabilities.) The reasons for these genetic properties can be understood
from the effect of natural selection on these characters, as follows.

Major components 

The essential feature of characters with fitness profiles like curve (1) is that the
population mean of the character is below its optimal value for fitness: individuals above
the mean for the character are of above-average fitness. Natural selection favours
these individuals but, despite the correlated selection differential on the character,
the mean of the character is unchanged. One might say that natural selection is
trying to increase the character but cannot do so. This situation could arise from two
genetic causes: either from genes at more or less intermediate frequencies that are 
overdominant with respect to fitness, or from deleterious recessives maintained at 
low frequencies by mutation balancing the selection. In either case, the variance 
of the character would be mainly non-additive, and dominance would be directional; 
the character would consequently have a low heritability and be subject to inbreeding 
depression. Genes at intermediate frequencies cause more variation than genes at 
low frequencies, so it is possible that most of the variance comes from overdominant 
loci and most of the inbreeding depression from rare recessives (Crow, 1952). The
expectation of low heritabilities and directional dominance was confirmed by a
comparison of 12 characters in Drosophila (Kearsey and Kojima, 1967). All the measures
of major components of fitness showed epistatic interaction and strong directional
dominance, while all the others (measures of body size and bristle number)
showed little or no dominance or interaction. In another experiment with Drosophila
(Mackay, 1985) fitness profiles of viability and fertility were obtained, and they 
closely resembled curve (1) of Fig. 20.2. These characters also showed a large amount 
of inbreeding depression. The results overall were consistent with the variation of 
fitness being due to recessives at low frequencies rather than to overdominant genes 
at intermediate frequencies. 

Fitness may be thought of as an index by which natural selection selects 
simultaneously for all the major components. We should then expect additive genetic 
correlations between characters that are major components of fitness to be negative, 
for the reasons given at the end of the previous chapter. But deleterious genes would 
be likely to have deleterious effects on more than one component of fitness, and 
so to contribute positively to the genetic correlation. High positive correlations were 
in fact found in the experiment cited above (Mackay, 1985), though these were in 
a laboratory population, not a wild one. 

Characters with intermediate optimum 

Characters with a fitness profile like curve (2) in Fig. 20.2 are said to have an
intermediate optimum because individuals with values of the character near the
population mean have the highest fitness, and selection appears to favour intermediates.
Selection that appears to favour intermediate values is known as stabilizing
selection. There is, however, an ambiguity in the way the terms ‘intermediate optimum’
and ‘stabilizing selection’ are used. They may be used purely as a description of
the fitness profile in which intermediates have the highest fitness. Or they may be 
used in an operative or functional sense, meaning that the character is a criterion 
of selection, intermediates being favoured because they have that value of the 
character. When artificial stabilizing selection is applied, the selection is functional 
because the character is a criterion of selection. But with natural selection, the shape 
of the fitness profile is not enough to tell us that functional stabilizing selection is 
operating. The reasons for this will be clearer when we consider examples of 
characters with intermediate optima. 

The evidence from which stabilizing natural selection is usually inferred rests on 
showing that the phenotypic variance is reduced by the selection. This is done by 
comparing the variances before and after selection, or among the survivors and the 
dead. (For details of how selection acting on a metric character can be measured 
see Lande and Arnold, 1983.) Sometimes more direct evidence can be obtained by 
showing that intermediates have higher values of one of the major components of 
fitness, such as survival. To prove that intermediates have the highest fitness, 
however, is not enough to tell us how natural selection acts on the character and 
on the genes that affect the character. For this it is necessary to know why
intermediates are fittest, and the answer to this question is not always obvious. The effect
of the selection depends on whether the value of the character itself is a direct cause 
of fitness, or whether the connection with fitness is through pleiotropic effects of 
the genes. The problem will be best explained by considering some examples, the 
first being a direct causal effect on fitness and the last having no causal effect. 
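
The kind of evidence described above, a reduction of phenotypic variance by selection with no change of the mean, can be imitated in a small simulation. The sketch assumes a normally distributed character in standard-deviation units and a bell-shaped survival function centred on the mean; the width `OMEGA`, the sample size, and the seed are arbitrary choices made for the illustration, not values from any experiment.

```python
import math
import random
import statistics

rng = random.Random(1)   # fixed seed so that the sketch is repeatable
OMEGA = 2.0              # hypothetical width of the survival function


def survives(x):
    """Survival is most probable at the mean (0) and falls off on both sides."""
    return rng.random() < math.exp(-x * x / (2.0 * OMEGA ** 2))


before = [rng.gauss(0.0, 1.0) for _ in range(20000)]   # phenotypes before selection
after = [x for x in before if survives(x)]             # survivors

var_before = statistics.pvariance(before)
var_after = statistics.pvariance(after)
mean_shift = statistics.mean(after) - statistics.mean(before)

print(round(var_before, 2), round(var_after, 2), round(mean_shift, 3))
```

The variance among survivors is smaller than the variance before selection, while the mean is almost unchanged. As the text goes on to explain, such a comparison describes the fitness profile but does not by itself show why intermediates are fittest.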

1. A straightforward example of a character having a direct causal effect on fitness 
would be any measure of the thermal insulation of a mammalian coat. The
conflicting needs for conserving heat during inactivity and dissipating heat during activity
are balanced at an intermediate coat density. Intermediates are favoured because
that value of the character is best. The selection is functional or true stabilizing
selection. A change of the mean in either direction would reduce the mean fitness, and
the only way by which the mean fitness could be increased is by reducing the variance. 
(This example, it must be said, is conjectural: it might be found that over the range 
of variation in any real population the differences in fitness associated with the 
character were fairly small.) 

2. Clutch size in birds is a well-known example of an intermediate optimum. The 
number of eggs laid (clutch size) is a direct cause of fitness, and if no other factor 
were involved the birds laying the largest clutches would be fittest. But the number 
of offspring that can be reared is limited by the available food supply, and an
intermediate clutch size results in the largest number reared. The mean clutch size has
been proved to be at or near the optimum in several species (Lack, 1966). The
intermediate optimum might be said to be imposed by the environment or, alternatively,
be said to result from a strong interaction between the two major components of
fitness, number and quality of the young. The selection is again true stabilizing
selection, but it differs from the previous example in one respect: the mean fitness would
be improved by an increase of clutch size if at the same time the birds’ skill in
obtaining food could be increased.

3. Next consider a character that itself has very little effect on fitness but which 
appears to have an intermediate optimum as a result of its correlations with other 
characters that do affect fitness. Body size of mice provides a plausible though not 
fully proved example. Body size in females is positively correlated with the size
of litter that they bear (Falconer, 1965b), and so with the major component of fitness,
number born. If this were the only factor, the optimum body size would be above
the mean. But body size is negatively correlated with reactivity, or wildness. Large
mice are placid and unreactive to disturbance, whereas small mice are alert and react
vigorously to disturbance (MacArthur, 1949; Falconer, 1953). Under natural
conditions one must suppose that larger mice would be less good at escaping predators
than small ones. The intermediate optimum body size thus results from opposing
correlations with different components of fitness, a positive correlation with number
born and a negative correlation with length of life. The stabilizing selection in this
case is spurious because the criterion of selection is not body size but the characters
correlated with it. Body size in Drosophila is better understood. From the known
correlations in D. melanogaster with the major components of fitness, the so-called
‘life history’ components, it is possible to construct an expected fitness profile of
body size. By combining the correlations with three components (fecundity,
development time, and larval survival) an expected fitness profile like (2) in Fig.
20.2 was obtained (Roff, 1981). Though natural selection was assumed not to act
on body size itself, intermediates are expected to have the highest fitness.
Furthermore, the mean body sizes found in populations of D. melanogaster as well as of
some other species all fall near the peak of the calculated profile. Thus the observed
body sizes are well explained by the correlations with the three major components
of fitness. The stabilizing selection is again spurious.

4. Finally, an example that illustrates well the difficulties in interpreting an
intermediate optimum is provided by sternopleural bristle number in Drosophila
melanogaster (Kearsey and Barnes, 1970). The character has the clear appearance
of being subject to stabilizing selection, but the apparent selection has nothing
whatever to do with the character itself. The sternopleural bristles are small bristles 
on the sides of the thorax of the adult flies. The population studied was derived from 
a cross of two strains selected in opposite directions; it had a mean of 18.5 bristles 
in females, with a range of approximately 10 to 45. The evidence for stabilizing 
selection is given in Fig. 20.3(a), which shows the distributions of bristle number 
in flies grown in uncrowded conditions (solid line) and in crowded conditions (broken 
line). In crowded conditions the natural selection for larval survival is much stronger 
and its effect in eliminating flies with the more extreme bristle numbers is clearly 
seen. A fitness profile can be constructed from the reduction in frequency of flies 
with different bristle numbers. This profile, however, is not of fitness itself but of 
one major component, larval survival. Figure 20.3(b) shows the profile, which has 
a sharp peak of fitness at the mean bristle number. The important point about this 
fitness profile, however, is that the selection which gives rise to the intermediate 
optimum takes place in the larval stages, before the flies have developed any bristles. 
The superior fitness of intermediates is therefore in no way caused by the character 
itself, but must result from pleiotropic effects of the genes on the larval characters 
that contribute to survival. The stabilizing selection is spurious, as in the previous 
example, but in this case we do not know what the real criteria of selection are, 
nor why adults with intermediate bristle numbers are fittest in the larval stages. 
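
The construction of a fitness profile from two frequency distributions can be sketched in a few lines of Python. This is only an illustration: the function name `fitness_profile` and the frequencies are hypothetical and are not the data of Kearsey and Barnes. The fitness of each bristle class is taken as the ratio of its frequency under strong selection to its frequency under weak selection, scaled so that the fittest class has a value of 1.

```python
def fitness_profile(freq_weak, freq_strong):
    """Relative survival of each class, scaled so the fittest class equals 1.

    freq_weak and freq_strong map a class value (e.g. bristle number) to its
    frequency under weak and under strong natural selection respectively.
    """
    # Ratio of frequencies measures how much each class was reduced or enriched.
    ratios = {k: freq_strong[k] / freq_weak[k]
              for k in freq_weak if freq_weak[k] > 0}
    best = max(ratios.values())
    return {k: r / best for k, r in ratios.items()}


# Hypothetical frequencies, for illustration only.
uncrowded = {14: 0.10, 16: 0.20, 18: 0.40, 20: 0.20, 22: 0.10}
crowded = {14: 0.05, 16: 0.20, 18: 0.50, 20: 0.20, 22: 0.05}

profile = fitness_profile(uncrowded, crowded)
for bristles in sorted(profile):
    print(bristles, round(profile[bristles], 2))
```

With these made-up numbers the profile peaks at the central class and falls away on both sides, the shape expected under apparent stabilizing selection. As in the text, such a profile describes only the component of fitness that was measured.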

Fig. 20.3. (a) Distributions of sternopleural bristle number in Drosophila: continuous line, 
under weak natural selection; broken line, under strong natural selection. 
(b) Fitness profile derived from the frequency distributions in (a). The fitness scale is relative to 
the fittest bristle class. The fitness measured is only the component, survival from egg to adult, 
and not the whole of fitness. (Adapted from Kearsey and Barnes, 1970.) 

Effects of stabilizing selection. The question to which the foregoing discussion has 
been leading is: How does stabilizing selection affect the genes that affect the character 
subject to it? The answer to this question is too complex to give in detail here, but 
the main conclusions reached by theoretical study are the following (for details, see 
Robertson, 1956; Lewontin, 1964; Curnow, 1964; Bulmer, 1985, p. 150). First, 
it depends on whether the stabilizing selection is real or spurious. If it is spurious 
there are no means of knowing its effects without knowing what are the components 
of fitness being selected. One possibility is the following (Robertson, 1956). The 
genes affecting the major components of fitness may have pleiotropic effects on the 
character in question, and these effects on the character may be more or less additive. 
Individuals with intermediate values of the character must then be heterozygous at 
more loci than extreme individuals. If the loci have overdominant effects on fitness, 
then natural selection favours heterozygotes and in consequence appears to favour 
intermediates for the character. The consequence of selection acting in this way would 
be the maintenance of the genetic variation of the character. In the case of the sternopleural 
bristles described above, however, it was shown from the shape of the fitness 
profile that the apparent stabilizing selection could not be accounted for in this way 
(Kearsey and Barnes, 1970). 

Real stabilizing selection has two main effects. First, it favours genotypes with 
the least variability (Curnow, 1964). Therefore, unless the least variable genotypes 
are heterozygotes, it tends to fix genes that confer the greatest developmental stability, 
irrespective of whether they affect the mean of the character or not. Stabilizing selection 
is thus expected to increase canalization of development (Waddington, 1957). 
Second, it tends to reduce the genetic variance of the character and this it does in 
two ways (Bulmer, 1971, 1976). The more immediate, and at first the largest, effect 
is through the creation of gametic phase disequilibrium. The selection causes allelic 
effects at different loci to be negatively correlated in individual genotypes. The 
covariance term in equation [8.10] is therefore negative and the genetic variance 
is reduced. In so far as the genes are linked, the selection tends to build up ‘balanced’ 
combinations in which linked genes are in predominantly repulsion linkage, so that 
the effect of the chromosome as a whole is minimized (see Mather, 1941; Lewontin, 
1964). The selection, however, has to be very strong, or the linkage very close, to 
keep the genes in combinations that are appreciably different from a random arrangement 
(see Wright, 1969, p. 92). The second way by which the genetic variance is 
reduced is through changes of gene frequencies at loci that affect the mean of the 
character. Provided the genotypes do not differ much in variability, and provided 
the loci do not affect fitness in any other way than through the character, then stabilizing 
selection tends to change the gene frequencies toward fixation, and so to reduce 
the genetic variance (Robertson, 1956). However, the gene frequencies change only 
slowly, unless selection is intense or the genes have large effects. The consequences 
of real stabilizing selection are thus to reduce both environmental and genetic variance. 
Both these effects have been observed experimentally; for example, with sternopleural 
bristle number in Drosophila (Gibson and Bradley, 1974), and with pupa weight 
in Tribolium (Kaufman, Enfield, and Comstock, 1977). 

Disruptive selection is the opposite of stabilizing selection, intermediates being 
selected against. Its expected effects are the opposite of stabilizing selection, an 
increase of both genetic and environmental variance, and these effects have again 
been observed experimentally. For example, when applied to larval development 
time in Drosophila (Prout, 1962), the environmental variance was increased; and 
when applied to pupa weight of Tribolium (Halliburton and Gall, 1981) the genetic 
and environmental variances were both increased. 

Neutral characters 

Evidence that a character is neutral with respect to fitness, having a profile like curve 
(3) in Fig. 20.2, can be obtained from a ‘perturbation’ experiment. A population 
is subjected to a few generations of directional selection for the character; then, when 
the mean has been changed some way from its original value, selection is suspended 
and the population is allowed to breed at random, subject only to natural selection. 
If the mean does not revert to its original value, or does so only very slowly, it 
can be concluded that the character is neutral, or nearly so. Strictly speaking, reversed 
selection should also be applied, to prove that the mean can be brought back. This 
is the evidence on which abdominal bristle number in Drosophila was cited earlier as 
an example of a neutral character (Latter and Robertson, 1962). If a character is 
proved to be neutral in this way, it does not mean that the genes affecting the character 
have no pleiotropic effects on fitness. It means only that, over the range covered 
by the perturbation, the character itself is not subject to natural selection. On the 
principle of genetic homeostasis, the gene frequencies must tend to revert if the perturbation 
is large, and the mean of the character may then move some way back 
toward its original value as a correlated response. To say that a character is neutral 
with respect to fitness is not to say that what is measured has no function. The abdominal 
bristles of Drosophila doubtless have a function, perhaps an important one, 
in the life of the adult fly. All that their neutrality means is that the precise number 
is not important for the fitness of the fly. 

Response of fitness to selection 

The foregoing section dealt with populations that are in equilibrium, or nearly so. 
Let us now consider populations that are not in equilibrium, in which fitness is not 
maximal, and in which there is some additive genetic variance of fitness. How fast 
will fitness increase under natural selection? This problem has an elegant solution, 
known as Fisher’s fundamental theorem, which states that the increase of fitness 
in one generation is equal to the additive genetic variance of fitness. This theorem 
has given rise to a great deal of discussion about its validity and its generality. A 
proof of the theorem and an explanation of why it has caused so much difficulty 
is given by Price (1972). The conclusion can be very simply demonstrated for populations 
with non-overlapping generations, by consideration of the weighted selection 
differential, as follows. 

The response to selection for a metric character is predicted from the selection 
differential by equation [11.2] ($R = h^2 S$). The selection differential $S$ is the 
weighted mean superiority of the selected parents, the weights being the relative 
contribution of progeny from which the response is evaluated. Fitness must be expressed 
as relative fitness. If $k$ is the number of offspring of any particular individual 
and $\bar{k}$ is the mean number of offspring of all individuals in that generation, the relative 
fitness of the individual is $W = k/\bar{k}$. The mean relative fitness is $\bar{W} = 1$. The relative 
fitness is also the weight to be attached to the individual’s contribution, $W - \bar{W}$, 
to the selection differential. Therefore, if N is the total number of individuals in 
the parental generation, the weighted selection differential on relative fitness is 

$$S_W = \frac{\sum W(W - \bar{W})}{N} = \frac{\sum W^2}{N} - \bar{W}^2 = V_{P(W)} \qquad [20.1]$$

(The last step in the derivation was explained in connection with equation [3.4].) 
Thus the selection differential on fitness is equal to the phenotypic variance of fitness, 
and we may note incidentally that the intensity of selection is equal to the phenotypic 
standard deviation: 


$$i_W = S_W/\sigma_W = \sigma_W \qquad [20.2]$$

Substituting equation [20.1] into equation [11.2] shows that the response of fitness 
to natural selection is equal to the additive genetic variance of fitness: 

$$R_W = h_W^2 V_{P(W)} = V_{A(W)} \qquad [20.3]$$
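
The equality of the weighted selection differential and the phenotypic variance of relative fitness can be checked numerically. The sketch below uses hypothetical offspring numbers; the function names and the heritability value are assumptions made for illustration only.

```python
from math import sqrt
from statistics import fmean, pvariance


def relative_fitness(offspring):
    """Convert offspring numbers k to relative fitness W = k / mean(k)."""
    kbar = fmean(offspring)
    return [k / kbar for k in offspring]


def weighted_selection_differential(values, weights):
    """Sum of weight * (value - mean value), divided by N."""
    mean_value = fmean(values)
    total = sum(w * (x - mean_value) for x, w in zip(values, weights))
    return total / len(values)


# Hypothetical numbers of offspring of six parents.
offspring = [0, 1, 2, 2, 3, 4]
W = relative_fitness(offspring)

S_W = weighted_selection_differential(W, W)  # selection differential on fitness
V_P = pvariance(W)                           # phenotypic variance of fitness
i_W = S_W / sqrt(V_P)                        # intensity of selection

h2_W = 0.1          # assumed heritability of fitness, illustrative only
R_W = h2_W * V_P    # response, equal to the additive variance of fitness

print(S_W, V_P, i_W, R_W)
```

The population variance (divisor N) is used, to match the divisor in the selection differential. The printed values of `S_W` and `V_P` agree, and `i_W` equals the phenotypic standard deviation of relative fitness.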

Correlated responses. A question of more interest, perhaps, than the change of 
fitness itself is the change of a metric character resulting from natural selection. 
For example, it used to be thought that natural selection was tending to reduce human 
intelligence because children from larger families have lower IQs than those from 
smaller families; in other words, there seemed to be a correlated selection differential 
for reduced IQ. In this case, however, the supposition about the selection differential 
is false for two reasons: first, because parents with zero fitness have no 
children and so cannot appear in the data; and second, because it is based on the 
correlation between parents’ fitness and children’s IQ. When the parents’ IQ is compared 
with their own subsequent number of children as a measurement of fitness, 
the correlated selection differential is found to be slightly positive (Waller, 1971). 

If a correlated selection differential is observed, can anything be deduced about 
the change in the character from natural selection? The correlated selection differential, 
$S_Y$, on a character $Y$ is the weighted mean of $Y$, the weight being the 
individual’s fitness. Thus 

$$S_Y = \frac{\sum W(Y - \bar{Y})}{N}$$

and by a derivation similar to that of equation [20.1] this leads to 

$$S_Y = \mathrm{cov}_{P(Y,W)} \qquad [20.4]$$

i.e., the correlated selection differential is equal to the phenotypic covariance of 
the character with fitness. The correlated response can now be obtained from equation 
[19.8c], i.e., $CR_Y/S_Y = \mathrm{cov}_A/\mathrm{cov}_P$. From this and equation [20.4], the correlated 
response is found to be 

$$CR_Y = \mathrm{cov}_{A(Y,W)} \qquad [20.5a]$$

i.e., the additive covariance of the character Y with fitness. Alternatively, the 
covariance can be written in terms of the genetic correlation and heritabilities. This 
gives 

$$CR_Y = r_A h_Y h_W \sigma_Y \sigma_W \qquad [20.5b]$$

where $\sigma$ denotes the phenotypic standard deviation. Thus, to predict a correlated 
response to natural selection it would be necessary to know the genetic correlation 
between the character and fitness, and the heritability of the character and of fitness. 
(Equation [20.5a] was derived by Robertson (1966) in connection with an analogous 
problem in dairy cattle, namely the correlated responses expected from the overall 
selection applied by farmers.) For a detailed treatment of correlated responses under 
natural selection see Crow and Nagylaki (1976). 
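
The two quantities just derived can be put side by side in a short Python sketch. The parameter values below are hypothetical and the function names are not from the text. Fitness is taken as relative fitness, so its standard deviation is expressed as a fraction of the mean.

```python
from math import sqrt


def correlated_selection_differential(r_p, sd_y, sd_w):
    """Phenotypic covariance of Y with fitness, written as r_P * sd_Y * sd_W."""
    return r_p * sd_y * sd_w


def correlated_response(r_a, h2_y, h2_w, sd_y, sd_w):
    """Additive covariance of Y with fitness: r_A * h_Y * h_W * sd_Y * sd_W."""
    return r_a * sqrt(h2_y) * sqrt(h2_w) * sd_y * sd_w


# Hypothetical values, for illustration only.
r_p = 0.2    # phenotypic correlation of the character with fitness
r_a = 0.2    # genetic correlation of the character with fitness
h2_y = 0.4   # heritability of the character
h2_w = 0.1   # heritability of fitness
sd_y = 10.0  # phenotypic standard deviation of the character
sd_w = 0.5   # phenotypic standard deviation of relative fitness

S_Y = correlated_selection_differential(r_p, sd_y, sd_w)
CR_Y = correlated_response(r_a, h2_y, h2_w, sd_y, sd_w)
print(S_Y, CR_Y)
```

With these numbers the apparent selection differential is 1.0 unit of the character, but the expected change is only about 0.2 units per generation, because the low heritability of fitness enters the prediction.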

Origin of variation by mutation 

Both random drift and stabilizing selection tend to reduce genetic variation. This 
raises the question of whether mutation produces new variation at a rate sufficient 
to counteract these tendencies. Let us first consider the balance between mutation 
and random drift without any selection. This was described in Chapter 15 in connection 
with inbred strains and subline divergence. The rate of origin of new variation 
can be estimated from inbred lines, by seeing how much variation accumulates 
when an inbred line is random-mated, or is selected for a metric character without 
further inbreeding. The new variation of abdominal and sternopleural bristle number 
in Drosophila has been estimated in this way by several experiments summarized 
by Hill (1982b). In both characters, the average amount of variation generated by 
mutation over one generation was about one-thousandth part of the genetic variation 
in the base population. In other words, it would take about 1,000 generations for 
mutation to restore the genetic variation to its original level. When expressed in 
the more convenient way as a proportion of the environmental variance, $V_E$, the 
mutational variance, $V_m$, was approximately $10^{-3} V_E$. (The reason why it is the same 
proportion of the environmental variance as it is of the genetic is that for these 
characters the heritability is about 50 per cent.) The mutational variances of other 
characters in other species are also roughly $10^{-3} V_E$ (Lynch, 1988). 

We saw in Chapter 15 that when the new mutational variance is balanced by loss 
from random drift the genetic variance present in a population of effective size $N_e$ 
is equal to $2N_e V_m$. The phenotypic variance is $2N_e V_m + V_E$ and the heritability is 
therefore 

$$h^2 = \frac{2N_e V_m}{2N_e V_m + V_E}$$

This is the broad-sense heritability because $V_m$ includes all the genetic variance, not 
just the additive. If we substitute $V_m = 10^{-3} V_E$ the heritability becomes 

$$h^2 = \frac{0.002 N_e}{1 + 0.002 N_e}$$

and this gives the following values for different population sizes: 

    $N_e$ =   100     1,000    10,000 
    $h^2$ =   0.17    0.67     0.95 

(see Lynch and Hill, 1986). Thus mutation in the absence of selection is able to 
maintain a large amount of genetic variation except in very small populations. The 
very high heritabilities of a few characters such as human finger ridge counts (see 
Chapter 10) suggest that these characters are subject only to neutral mutation with 
no selection acting on the genes that influence them. 
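
The heritabilities tabulated above follow directly from the formula, as this short Python sketch shows. The function name is an assumption made for illustration; the ratio of mutational to environmental variance defaults to the value of one-thousandth given in the text.

```python
def heritability_mutation_drift(n_e, vm_over_ve=1e-3):
    """Broad-sense heritability when mutation is balanced by random drift.

    The genetic variance is 2 * N_e * V_m; variances are expressed in units
    of the environmental variance V_E, so V_E itself is 1.
    """
    genetic_variance = 2 * n_e * vm_over_ve
    return genetic_variance / (genetic_variance + 1.0)


for n_e in (100, 1000, 10000):
    print(n_e, round(heritability_mutation_drift(n_e), 2))
```

Changing `vm_over_ve` shows how sensitive the conclusion is to the estimate of mutational variance: the heritability depends only on the product of the effective population size and this ratio.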

Now consider the balance between mutation producing new variation and stabilizing 
selection eliminating it. This is a very complicated matter and it cannot be discussed 
in detail here. The amount of genetic variance maintained depends on many 
parameters which are not known, or are known only very crudely. These are, among 
other things, the shape of the fitness profile and the intensity of selection, the number 
of loci affecting the character and the distribution of their effects, the number of 
alleles at each locus, and of course the population size and the rate of origin of new 
variation. When certain assumptions are made about the unknown parameters, theory 
shows that the amount of genetic variance found in populations can be maintained 
by the mutation—selection balance (see Lande, 1976, 1988). There are doubts, 
however, about the validity of some of the assumptions. They imply a per-locus 
mutation rate that is too high to be credible; and pleiotropy is not taken into account 
(the genes may be selected for their effects on characters other than the one under 
consideration). (See Turelli, 1984, 1988.) 

Mutation, selection, and random drift are the only forces that can change the amount 
of genetic variance in a random-mating population. Though there is doubt about 
precisely how these balance each other and about the amount of genetic variance 
maintained, there can be no doubt that these forces do balance each other, because 
genetic variance is found to exist for almost every character that has been studied 
in wild or laboratory populations. 

The major components of fitness are not subject to stabilizing selection. Nearly 
all mutants affecting them are deleterious and are eliminated, or kept at low 
frequencies, by natural selection. It was noted earlier that the variance of fitness 
seems to be due mainly to recessives at low frequencies. The question then is whether 
their rate of origin by mutation is high enough to account for their presence. The 
rate of origin of mutants affecting viability in Drosophila, i.e. survival from egg 
to adult, has been estimated (Mukai et al., 1972). By the use of special marker stocks, 
second chromosomes were kept intact through 40 generations and mutations allowed 
to accumulate. The homozygous effects of these chromosomes on larval survival 
were tested at 10-generation intervals, from which the rate of origin of mutants on 
chromosome-2 was deduced. There were two classes of mutated genes: those that 
were lethal in homozygotes, and those that were mildly detrimental, with viabilities 
down to about 0.6 of normal. After 40 generations, 24 per cent of chromosomes 
carried one or more lethals, and the rate of origin was 0.6 per cent per chromosome 
per generation. The detrimentals were about 10 to 20 times as frequent, with a rate 
of origin of about 10 per cent per chromosome per generation. Scaling up by 5/2 
to represent the whole genome, this means that after one generation of mutation, 
about 25 per cent of gametes carry a new mildly detrimental gene. This is clearly 
enough to maintain the genetic variance in the face of selection against the detrimental 
genes. It may be noted that, whereas lethals were found to be almost completely 
recessive, mildly detrimental genes were not. This inverse relationship between 
dominance and severity of effect is a general feature of deleterious genes in 
Drosophila; see the review by Simmons and Crow (1977). 
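
The arithmetic behind these figures can be retraced roughly in Python. The sketch assumes, as a simplification that is not stated in the text, that new lethals arise independently at a constant rate, so that the number per chromosome follows a Poisson distribution. The function name is hypothetical.

```python
from math import log


def per_generation_rate(fraction_carrying, generations):
    """Rate of origin per chromosome per generation under a Poisson model.

    The fraction of chromosomes still free of mutants after t generations
    is exp(-rate * t), so rate = -ln(1 - fraction_carrying) / t.
    """
    return -log(1.0 - fraction_carrying) / generations


# 24 per cent of second chromosomes carried a lethal after 40 generations.
lethal_rate = per_generation_rate(0.24, 40)

# Detrimentals: about 10 per cent per second chromosome per generation,
# scaled by 5/2 to represent the whole genome.
genome_detrimental_rate = 0.10 * 5 / 2

print(round(lethal_rate, 4), genome_detrimental_rate)
```

Under this assumption the lethal rate comes out a little under 0.7 per cent per generation, of the same order as the 0.6 per cent quoted, which was deduced from the tests made at 10-generation intervals. The scaled rate for detrimentals is the 25 per cent given in the text.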

The genes causing quantitative variation 

Finally, some comment must be made on the nature of the genes that cause variation 
of metric characters. There is no evidence that these are a special category of 
genes (Thoday, 1977). Rather, it seems that any sort of gene may affect metric 
characters, their effects being often secondary to their primary function in development 
or metabolism. Let us, however, look at this question a little more closely 
and see what evidence there is about it, of which the following is only a very limited 
selection. The question mainly concerns characters, not closely connected with fitness, 
whose variation is due to genes that are not unconditionally deleterious. 

One must, of course, be able to distinguish the allelic variants of a gene before 
one can look to see if it affects a metric character. The protein polymorphisms, identified 
by electrophoresis or immunologically, provide the most easily accessible allelic 
variants. There are two ways of looking for their effects. The first is by comparing 
the mean of the character in the different genotypes of the locus. For example, 
Niemann-Sorensen and Robertson (1961) looked for effects of ten blood-group loci 
in cows by comparing body weights, milk-yields, and fat percentages in the milk 
of individuals with different alleles. Two significant effects were found, of which 
the clearest was an increase of fat percentage associated with one particular allele 
of the B blood-group system. The magnitude of the effect was 0.23 phenotypic standard 
deviations, and the segregation of this allele accounted for about 2.8 per cent 
of the genetic variance. As another example, allelic variants of the enzyme GPDH 
were found to affect the power output of the flight muscles of Drosophila (Barnes 
and Laurie-Ahlberg, 1986). The second method is by looking for changes in gene 
frequency at the protein locus resulting from artificial selection for the metric 
character. For example, at least three of eight enzyme loci were shown in this way 
to affect grain yield in maize (Stuber et al., 1980). These studies are subject to the 
fundamental difficulty of deciding whether the effect found is of the protein locus 
itself or of some other closely linked locus. Effects of enzyme loci on metric characters 
are by no means always found, and this is not surprising. The locus may affect some 
metric character but not one that was measured; allozymes distinguished by electrophoresis 
do not necessarily differ in the specific activity of the enzyme produced, 
and without a difference in the amount of activity there is unlikely to be any effect 
on a metric character. 

If a locus affects a metric character the effect is most likely to be due to the amount 
of a gene-product. Regulatory genes, which control the amount of gene-product, 
may be a major source of the variation of metric characters (see McDonald and Ayala, 
1978; Hedrick and McDonald, 1980) but there is little concrete evidence. The amount 
of gene-product can also be affected by the number of copies of the structural gene, 
and here there is clear evidence of metric characters being affected by the copy 
number. For example, the resistance of an aphid to insecticides is affected by the 
amount of an esterase enzyme (Devonshire and Sawicki, 1979). Susceptible strains 
have one copy and resistant strains up to 64 copies of the gene, the amount of enzyme 
produced being directly proportional to the copy number. Another example is provided 
by transgenic mice with extra copies of a growth hormone gene, rat or human. 
These grow faster and are larger than normal mice (see Palmiter et al., 1983). A 
third example, where the connection between the gene-product and the metric 
character is not so direct, is the effect of the ‘bobbed’ locus in Drosophila on the 
number of abdominal bristles (Frankham, Briscoe, and Nurthen, 1980; Frankham, 
1988). The ‘bobbed’ locus codes for ribosomal RNA and consists of multiple copies 
of a DNA sequence repeated in tandem. Mutations occur by unequal crossing over 
which increases or decreases the copy number. Alleles with a reduced copy number 
lead to a reduced number of bristles. This is thought to be because there are fewer 
ribosomes and the rate of protein synthesis is reduced. 

Dominance and enzyme activity 

We have seen several times that much of the variation of fitness and its major components 
is due to recessive deleterious genes at low frequencies, whereas the variation 
of characters not closely related to fitness is mainly due to genes with little 
dominance and at intermediate frequencies. The evidence for this comes from the 
heritabilities or the amount of non-additive genetic variance and inbreeding depression. 
Some insight into why deleterious genes are recessive has been gained from 
consideration of the enzyme kinetics of metabolic pathways (Kacser and Burns, 1981; 
and see also Hartl, Dykhuizen, and Dean, 1985; Dean, Dykhuizen, and Hartl, 1988). 
Many of the genes affecting metric characters must presumably be genes coding 
for enzymes, and the allelic differences causing the variation must presumably be 
differences in the activities of the enzymes produced. The conclusions about 
dominance are therefore relevant to questions about the genetics of metric characters. 
In a metabolic pathway each enzyme mediates one of the sequential steps. We have 
to consider the flux through the pathway, which is equivalent to the amount of the 
end-product. And we have to suppose that it is the amount of end-product that affects 
the metric character under consideration. 

The enzyme activity of heterozygotes measured in extracts is mid-way, or nearly 
so, between the activities of the two homozygotes; but the flux in the pathway, and 
therefore the effect on the character, is not necessarily also intermediate. The reason 
for this is that the relationship between the flux and the activity of any particular 
enzyme in the pathway is non-linear. At low levels of enzyme activity a small increase 
of activity results in a large increase of flux, but at high levels of enzyme activity 
the same increase of activity results in only a small increase of flux. The consequence 
of this curvilinear relationship is that if two alleles produce enzymes with 
a large difference in activity then the allele producing the high-activity enzyme is 
dominant; but if they produce enzymes with only a small difference in activity there 
is little dominance because over a small range of activity the curve relating flux 
to enzyme activity is nearly linear. Now, if we assume that a high flux in any pathway 
is beneficial, which seems reasonable, it will follow that most mutant alleles will 
have a reduced enzyme activity, and this has been confirmed experimentally; alleles 
with a much reduced enzyme activity will be deleterious and will therefore be at 
low frequencies, and they will be recessive. On the other hand, alleles whose enzyme 
activities differ only a little will have little effect on the flux or on fitness and 
will therefore be at intermediate frequencies, and they will show little dominance. 
We should therefore expect much of the variation of fitness to be caused by deleterious 
recessives at low frequencies and the variation of characters not closely connected 
with fitness to be due to genes with little dominance. These studies of enzyme systems 
show furthermore that dominance is a consequence of the organization of enzymes 
in metabolic pathways and needs no other explanation. 
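
The argument can be illustrated with a simple saturating curve. In the Python sketch below the hyperbolic form of the flux, the constant `k`, and the activity values are all assumptions chosen for illustration; they are not taken from the studies cited. The heterozygote is given an enzyme activity mid-way between the two homozygotes, and dominance is expressed as the ratio d/a, where a is half the difference in flux between the homozygotes and d is the deviation of the heterozygote from their mid-point.

```python
def flux(activity, k=0.1):
    """Hypothetical saturating relation between enzyme activity and flux."""
    return activity / (activity + k)


def degree_of_dominance(act_low, act_high, k=0.1):
    """Ratio d/a for flux when heterozygote activity is exactly intermediate."""
    j_low = flux(act_low, k)
    j_high = flux(act_high, k)
    j_het = flux((act_low + act_high) / 2, k)
    a = (j_high - j_low) / 2            # half the homozygote difference
    d = j_het - (j_high + j_low) / 2    # heterozygote deviation from mid-point
    return d / a


# Large difference in activity: a null allele against a normal allele.
large = degree_of_dominance(0.0, 1.0)

# Small difference in activity between two nearly equivalent alleles.
small = degree_of_dominance(0.9, 1.0)

print(round(large, 2), round(small, 2))
```

With these illustrative values the null allele is nearly recessive (d/a above 0.8), while the two alleles of similar activity are nearly additive (d/a below 0.1), in line with the reasoning in the text.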

Induced mutation 

Though perhaps not very relevant to the question under discussion here, it is worth 
noting that the variation of metric characters can be increased by mutagenic agents 
in the same way that the mutation rates of major genes are increased. For example, 
the mutational variance ($V_m/V_E$) of Drosophila’s abdominal bristles is increased 
three-fold over the spontaneous rate by 1,000R of X-rays (see Mackay, 1987). An 
enormously higher mutation rate has been found to result from transposable elements. 
(These are stretches of DNA which move themselves from one site to another in 
the chromosomes; the gene in or near which they insert is mutated.) The P-element 
of Drosophila causes mutation of major genes. In crosses where the transposition 
of the P-element was activated, the mutational variance of abdominal bristle number 
was found to be 150 times the spontaneous value (Mackay, 1987, 1988). The insertion 
of a transposable element seems likely to be more damaging to the gene than 
the mutations occurring spontaneously or induced by X-rays, many of which are 
single base changes. The mutations induced by P-elements may therefore have larger 
effects, and so cause more variation, than other mutations. 


The pursuit of the genes that affect metric characters has more significance than 
just to satisfy our curiosity. The introduction of foreign genes by the methods of 
genetic manipulation, if this is to improve a metric character, obviously depends 
first on the identification of the genes that might yield a worth-while improvement. 
If there were a large number of genes all with small effects, this would be a fruitless 
search. But with rather few genes contributing much of the variance, the identification 
of genes with effects large enough to be useful does not seem impossible. 

Problems 

20.1 Problem 11.2 was concerned with evolutionary change in a species of Darwin’s 
Finch following selective survival with respect to bill depth. Prediction of the response 
to this natural selection was made from $R = h^2 S$ [11.2] on the assumption that ‘the 
cause of the selective survival was the bill depth itself and not some other character 
correlated with it’. Show that the prediction would be valid if 

$$\frac{r_A}{r_P} = \frac{h_Y}{h_W}$$

where $Y$ is bill depth and $W$ is fitness, and the correlations are between bill depth 
and fitness. 


[Solution 2] 

20.2 A rough idea of the effect of natural selection on IQ score can be got from 
the following data on a sample of Whites in Minnesota. The data refer to the IQ 
scores of individuals and the size of those individuals’ completed families, i.e. the 
number of their children. It has to be assumed that family size is an adequate measure 
of fitness. The means, standard deviations, and heritabilities were 



               Mean    s.d.    $h^2$ 
IQ score       103     15.4    0.6 
Family size    3.4     2.3     0.1-0.2 


The heritability of family size was not reliably estimated but was probably in the 
range indicated. The correlation between the two characters was +0.11. It has to 
be assumed that the genetic correlation was not different from the phenotypic correlation. 
On the basis of these data and assumptions, what is the predicted change 
of IQ score per generation? What would be the apparent, i.e. correlated, selection 
differential on IQ scores in this population? 

Data from Waller, J. H. (1971) Social Biol., 18, 122-36. 


[Solution 65] 

20.3 Make a ‘fitness profile’ like Fig. 20.2 for human birth weight from the following 
data. Records of all babies born in Italy in 1974 were analysed. The data here 
refer to males born after a normal pregnancy of 9 months, of which there were 
413,572. Birth weights were grouped in classes whose mid-points are given. The 
mean birth weight was 3.46 kg and the standard deviation was 0.51 kg. For each 
birthweight class the mortality rate per thousand in the first four weeks, including 
stillbirths, is given. For the purpose of making the fitness profile it has to be assumed 
that survival to the age of four weeks is equivalent to fitness. 


Birth weight (kg)    Frequency (%)    Mortality per 1,000 
1.3                  0.13             612 
1.8                  0.34             333 
2.3                  2.13             94 
2.8                  15.95            27 
3.3                  40.32            15 
3.8                  30.73            11 
4.3                  8.54             12 
4.8                  1.56             25 
5.55                 0.30             69 


Data from Terrenato, L. et al. (1981) Ann. Hum. Genet., 45, 55-63, and Ulizzi, 
L. et al. (1981) Ann. Hum. Genet., 45, 207-12. 

Appendix Table A Truncated normal distribution (large sample). p = proportion of population 
with values exceeding the truncation point T; x = deviation of T from the mean, in 
standard-deviation units; i = mean deviation of individuals with values exceeding T, in 
standard-deviation units from the population mean. For values of p greater than 50 per cent: take x 
and i tabulated for (1 - p); give x a negative sign; multiply i by (1 - p)/p, retaining the positive 
sign. Errors from linear interpolation of p are positive, the largest in both x and i being 
approximately +0.001 when p > 0.10 per cent. (Abridged from Falconer, 1965a.) 


  p%      x       i         p%      x       i         p%      x       i

 0.01   3.719   3.960      0.75   2.432   2.761       10    1.282   1.755
 0.02   3.540   3.790      0.80   2.409   2.740       11    1.227   1.709
 0.03   3.432   3.687      0.85   2.387   2.720       12    1.175   1.667
 0.04   3.353   3.613      0.90   2.366   2.701       13    1.126   1.627
 0.05   3.291   3.554      0.95   2.346   2.683       14    1.080   1.590
 0.06   3.239   3.507      1.00   2.326   2.665       15    1.036   1.554
 0.07   3.195   3.464                                 16    0.994   1.521
 0.08   3.156   3.429      1.0    2.326   2.665       17    0.954   1.489
 0.09   3.121   3.397      1.2    2.257   2.603       18    0.915   1.458
 0.10   3.090   3.367      1.4    2.197   2.549       19    0.878   1.428
                           1.6    2.144   2.502       20    0.842   1.400
                           1.8    2.097   2.459       21    0.806   1.372
 0.10   3.090   3.367      2.0    2.054   2.421       22    0.772   1.346
 0.12   3.036   3.317      2.2    2.014   2.386       23    0.739   1.320
 0.14   2.989   3.273      2.4    1.977   2.353       24    0.706   1.295
 0.16   2.948   3.234      2.6    1.943   2.323       25    0.674   1.271
 0.18   2.911   3.201      2.8    1.911   2.295       26    0.643   1.248
 0.20   2.878   3.170      3.0    1.881   2.268       27    0.613   1.225
 0.22   2.848   3.142      3.2    1.852   2.243       28    0.583   1.202
 0.24   2.820   3.117      3.4    1.825   2.219       29    0.553   1.180
 0.26   2.794   3.093      3.6    1.799   2.197       30    0.524   1.159
 0.28   2.770   3.070      3.8    1.774   2.175       31    0.496   1.138
 0.30   2.748   3.050      4.0    1.751   2.154       32    0.468   1.118
 0.32   2.727   3.030      4.2    1.728   2.135       33    0.440   1.097
 0.34   2.706   3.012      4.4    1.706   2.116       34    0.412   1.078
 0.36   2.687   2.994      4.6    1.685   2.097       35    0.385   1.058
 0.38   2.669   2.978      4.8    1.665   2.080       36    0.358   1.039
 0.40   2.652   2.962      5.0    1.645   2.063       37    0.332   1.020
 0.42   2.636   2.947                                 38    0.305   1.002
 0.44   2.620   2.932                                 39    0.279   0.984
 0.46   2.605   2.918      5.0    1.645   2.063       40    0.253   0.966
 0.48   2.590   2.905      5.5    1.598   2.023       41    0.228   0.948
 0.50   2.576   2.892      6.0    1.555   1.985       42    0.202   0.931
                           6.5    1.514   1.951       43    0.176   0.913
                           7.0    1.476   1.918       44    0.151   0.896
 0.50   2.576   2.892      7.5    1.440   1.887       45    0.126   0.880
 0.55   2.543   2.862      8.0    1.405   1.858       46    0.100   0.863
 0.60   2.512   2.834      8.5    1.372   1.831       47    0.075   0.846
 0.65   2.484   2.808      9.0    1.341   1.804       48    0.050   0.830
 0.70   2.457   2.784      9.5    1.311   1.779       49    0.025   0.814
 0.75   2.432   2.761     10.0    1.282   1.755       50    0.000   0.798
Appendix Table B Truncated normal distribution: small sample. The tabulated values
are the intensity of selection, i, when n individuals are selected from a total of N.
Errors from linear interpolation of N are negative, the largest being approximately
-0.0075; interpolation of n gives positive errors, maximum about +0.006. (Abridged
from Becker, 1984, where much more extensive tables may be found.)


                                      N
  n      2      3      4      5      6      7      8     10     12

  1    0.564  0.846  1.029  1.163  1.267  1.352  1.424  1.539  1.629
  2      -    0.423  0.663  0.829  0.954  1.055  1.138  1.270  1.372
  3      -      -    0.343  0.553  0.704  0.821  0.916  1.065  1.179
  4      -      -      -    0.291  0.477  0.616  0.725  0.893  1.019
  5      -      -      -      -    0.253  0.422  0.550  0.739  0.877
  6      -      -      -      -      -    0.225  0.379  0.595  0.748
  7      -      -      -      -      -      -    0.203  0.457  0.627
  8      -      -      -      -      -      -      -    0.318  0.509
  9      -      -      -      -      -      -      -    0.171  0.393
 10      -      -      -      -      -      -      -      -    0.274

                                      N
  n     14     16     18     20     25     30     40     50     60

  1    1.703  1.766  1.820  1.867  1.965  2.043  2.161  2.249  2.319
  2    1.456  1.525  1.585  1.638  1.745  1.829  1.957  2.052  2.127
  3    1.271  1.347  1.412  1.469  1.584  1.674  1.810  1.911  1.990
  4    1.119  1.201  1.271  1.332  1.455  1.550  1.694  1.799  1.882
  5    0.986  1.075  1.150  1.214  1.345  1.446  1.596  1.705  1.792
  6    0.866  0.962  1.042  1.110  1.248  1.354  1.510  1.624  1.713
  7    0.755  0.858  0.943  1.016  1.161  1.271  1.434  1.552  1.644
  8    0.650  0.760  0.851  0.928  1.081  1.196  1.365  1.487  1.582
 10    0.447  0.577  0.681  0.767  0.936  1.061  1.242  1.372  1.472
 15      -    0.118  0.282  0.405  0.624  0.777  0.991  1.139  1.252
 20      -      -      -      -    0.336  0.530  0.782  0.951  1.076

                                      N
  n     70     80    100    150    200    250    300    350    400

  1    2.377  2.427  2.508  2.649  2.746  2.819  2.878  2.927  2.968
  2    2.189  2.242  2.328  2.478  2.580  2.657  2.718  2.769  2.813
  3    2.055  2.111  2.201  2.357  2.463  2.543  2.607  2.660  2.705
  4    1.950  2.008  2.101  2.263  2.372  2.455  2.520  2.574  2.621
  5    1.862  1.922  2.018  2.185  2.297  2.382  2.449  2.504  2.552
  6    1.786  1.848  1.947  2.118  2.233  2.320  2.388  2.445  2.493
  8    1.659  1.724  1.828  2.007  2.127  2.217  2.288  2.346  2.396
 10    1.553  1.621  1.730  1.916  2.040  2.132  2.206  2.266  2.317
 15    1.342  1.417  1.536  1.738  1.871  1.970  2.048  2.112  2.166
 20    1.175  1.257  1.386  1.601  1.742  1.846  1.928  1.995  2.051
 25    1.032  1.121  1.259  1.488  1.636  1.745  1.830  1.900  1.958



GLOSSARY OF SYMBOLS 


Numbers in square brackets refer to Chapters where the meaning applies. 

Some meanings with restricted use are not listed. 

$A_1$, $A_2$ Alleles at a locus under consideration.

$A$ Breeding value.

$a$ Genotypic value of homozygote $A_1A_1$, as deviation from the mid-homozygote
value.

B or b As subscript, indicates between families or groups.

$b$ Regression coefficient; e.g. $b_{OP}$ = regression of offspring on parent.

$CR$ Correlated response to selection.

$D$ Dominance deviation.

$d$ Genotypic value of heterozygote as deviation from the mid-homozygote value.

$E$ Environmental deviation.

$e^2$ $= 1 - h^2$.

$E_c$ Common environment; i.e. environmental deviation of family mean from
population mean.

$E_g$ Environment due to permanent, or general, effects.

$E_s$ Environment due to temporary, or special, effects.

$E_w$ Within-family environment; i.e., environmental deviation of individual from
family mean.

$F$ Coefficient of inbreeding.

$F_1$ First generation of cross between lines or populations.

$F_2$ Second generation of cross, by random mating among $F_1$.

FS Full sibs.

$f$ Coancestry = coefficient of kinship.

$f$ Subscript referring to females.

$f$ [13] Subscript meaning between families.


G Genotypic value. 

GCA General combining ability. 
$H$ Frequency of heterozygous genotype, $A_1A_2$.

$H$ [14] Amount of heterosis; i.e., deviation of cross mean from mid-parent value.

HS Half sibs.

$h^2$ Heritability (‘narrow sense’).

$I$ Interaction deviation, due to epistasis.

$I$ [13, 19] Index for selection.

$i$ Intensity of selection; i.e., selection differential in units of phenotypic standard
deviation.

$k$ Numbers in various contexts. In [4, 10, 20] family size, i.e., number in
family.

$L$ [2] Load.

$L$ [4, 11] Generation length.

$M$ Population mean.

$m$ [18] Population mean.

$m$ Subscript referring to males.

$m$ [10] Correlation between breeding values of mates.

$N$ Population size; i.e., number of breeding individuals in a population or line.

$N$ [10, 13] Number of families.

N e Effective population size. 

$n$ Numbers in various contexts. In [10, 13] specifically number of offspring per
family.

$O$ Offspring.

$P$ Parent.

$\bar{P}$ Mid-parent.

$P$ Frequency of homozygous genotype $A_1A_1$.

$P$ Panmictic index, $= 1 - F$.

$P$ Phenotypic value.

$p$ Gene frequency of $A_1$, the allele that increases the character.

$p$ [11, 18] Proportion selected, or exceeding point of truncation of a normal
distribution.

$pg$ The pygmy gene of mice, used in several examples.

$Q$ Frequency of homozygous genotype $A_2A_2$.

$q$ Gene frequency of $A_2$, the allele that reduces the character.

$R$ Response to selection; specifically to individual selection.

$R_T$ [12] Total range; i.e., difference in mean between two populations at opposite
selection limits.

$r$ [8] Repeatability; i.e., correlation between repeated measurements of the same
individual.

$r$ Coefficient of relationship; i.e., correlation of breeding values between
related individuals.

$r$ [10] Phenotypic correlation between mates.

$r$ [19] Correlations between two characters: $r_A$ = correlation of breeding values,
$r_E$ = environmental correlation, $r_P$ = phenotypic correlation.

S Selection differential in actual units of measurements. 
SCA Specific combining ability.

$s$ Coefficient of selection against a specified genotype.

$s$ [13] Subscript referring to sib-selection.

$T$ Total in various contexts.

$t$ Time in number of generations. As subscript it means ‘at generation $t$’.

$t$ Phenotypic correlation (intraclass) between members of families.

$u$ Mutation rate (from $A_1$ to $A_2$).

$u$ [9, 15] Coefficient of the dominance variance in the covariance of relatives.

$V$ Variance (causal component) of the value of deviation indicated by a subscript:
$V_P$ = phenotypic, $V_G$ = genotypic, $V_A$ = additive genetic, $V_D$ = dominance,
$V_I$ = interaction (epistatic), $V_{NA}$ = non-additive genetic, $V_E$ = environmental.

$V_m$ Variance generated by mutation in one generation.

$v$ Mutation rate (from $A_2$ to $A_1$).

W or w As subscript, indicates within families or groups.

$W$ [20] Fitness under natural selection.

$X$ Subscript denoting any particular individual, e.g. [5] $F_X$ = inbreeding
coefficient of individual X.

$X$ One of two correlated characters.

$x$ [11, 18] The normal deviate; i.e., deviation, in standard-deviation units, of
point of truncation from population mean.

$Y$ The other of two correlated characters.

$y$ Difference in gene frequency between two lines.

$z$ Height of the ordinate of a normal distribution, in standard-deviation units.

$\alpha$ Average effect of a gene substitution.

$\alpha_1$, $\alpha_2$ Average effects of alleles $A_1$ and $A_2$ respectively.

$\Delta$ Change of, as $\Delta q$ = change of gene frequency, $\Delta F$ = rate of inbreeding.

$\Sigma$ Summation of the quantity following the sign.

$\sigma$ Standard deviation ($\sigma^2$ = variance) of the quantity indicated by subscript.


Equivalence of symbols used by Mather and Jinks
as defined in Mather and Jinks (1977, p. 219)

Mather and Jinks    This book

$d$                 $a$
$[d]$               $\Sigma a$
$D$                 $\Sigma a^2 = 2V_A$ when all $p = q = \tfrac{1}{2}$ (Equation [8.7]).
$D_R$               $2V_A$ in random-breeding population.
$E_w$               $V_{Ew}$
$E_b$               $V_{Ec}$
$E_1$               $V_{Ew}$
$E_2$               $V_{Ec} + \tfrac{1}{n}V_{Ew}$
$h$                 $d$
$[h]$               $\Sigma d$
$H$                 $\Sigma d^2 = 4V_D$ when all $p = q = \tfrac{1}{2}$ (Equation [8.7]).
$H_R$               $4V_D$ in random-breeding population.



SOLUTIONS OF PROBLEMS 


[Numbers in square brackets refer to the numbered equations in the text.] 


1 (1.1)

(1) The total number counted is 6,129. With this large number the
frequencies need 5 decimal places to avoid rounding errors. The genotype
frequencies are

MM: P = 1,787/6,129 = 0.29156
MN: H = 3,039/6,129 = 0.49584
NN: Q = 1,303/6,129 = 0.21260

Check that these frequencies add to 1.

(2) Putting the numbers for P, H, and Q into [1.1] gives the gene
frequencies as

M: $p = [1{,}787 + (\tfrac{1}{2} \times 3{,}039)]/6{,}129 = 0.53948$
N: $q = [1{,}303 + (\tfrac{1}{2} \times 3{,}039)]/6{,}129 = 0.46052$

Check that p + q = 1.

(3) By [1.2] the expected genotype frequencies are

MM: $(0.53948)^2 = 0.29104$

MN: $2 \times 0.53948 \times 0.46052 = 0.49688$

NN: $(0.46052)^2 = 0.21208$

Check that the expected frequencies add to 1.

(4) Very close agreement. To test, we must convert the expected frequencies
to expected numbers for comparison with the observed numbers. Multiplying the
frequencies in (3) by the total number gives the expected numbers as

   MM         MN         NN

 1,783.8    3,045.4    1,299.8

Check that the numbers add to 6,129. $\chi^2$ is calculated as
$\Sigma[(\text{Obs.} - \text{Exp.})^2/\text{Exp.}]$, from which $\chi^2 = 0.027$. This very low
value confirms the close agreement. This $\chi^2$ has one degree of freedom because
the observed numbers were used to estimate the gene frequency, and the expected
numbers must be made to fit this as well as the total; in other words, there are
three numbers with two constraints, so one degree of freedom is left.
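
The four steps of this solution can be checked with a few lines of code. The following is only a sketch in Python; the function name `hardy_weinberg_test` is an illustrative choice, not something used elsewhere in the book. It takes the three observed genotype numbers, estimates the gene frequency, and returns the expected numbers and the value of $\chi^2$.

```python
def hardy_weinberg_test(n_hom1, n_het, n_hom2):
    """Gene frequencies, Hardy-Weinberg expected numbers and chi-square.

    n_hom1, n_het, n_hom2 are the observed numbers of the two
    homozygotes and the heterozygote (hypothetical argument names).
    """
    total = n_hom1 + n_het + n_hom2
    p = (n_hom1 + 0.5 * n_het) / total      # frequency of the first allele
    q = 1.0 - p
    expected = (p * p * total, 2 * p * q * total, q * q * total)
    observed = (n_hom1, n_het, n_hom2)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, q, expected, chi2


# MN blood-group numbers from this solution
p, q, expected, chi2 = hardy_weinberg_test(1787, 3039, 1303)
print(round(p, 5), round(q, 5))
print([round(e, 1) for e in expected], round(chi2, 3))
```

The $\chi^2$ returned has one degree of freedom, for the reason given in the text.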
2 (20.1) The observed selection differential on bill depth ($Y$) was a correlated
selection differential, $S'$, following selection for fitness ($W$). The prediction made
was $R = h_Y^2 S'$. The response, however, is a correlated response and should be
predicted from $S'$ by [19.86]:

$CR_Y = \frac{r_A}{r_P} h_W h_Y S'$

Therefore $CR_Y$ would be correctly predicted by $R$ if

$\frac{r_A}{r_P} h_W h_Y = h_Y^2$

i.e. if

$\frac{r_A}{r_P} = \frac{h_Y}{h_W}$

3 (3.1) If drawn from a random-breeding population the genotypes would 
be in Hardy-Weinberg proportions. These, calculated as in Problem 1.1, are 

aa ab bb Total 

101.0 821.0 1,669.0 2,591 

The observed numbers show an excess of both homozygotes and a corresponding
deficiency of heterozygotes. The discrepancy is highly significant ($\chi^2_1 = 12.9$,
$P < 0.001$). The data suggest that the population was a mixture of sub-populations
with different gene frequencies.

4 (5.1) The following pedigree of a single first-cousin marriage will serve to 
illustrate all three relationships. 



The solutions can be got in two ways, by [5.1] or by [5.2]. Applying [5.1] to the
pedigree:

(1) The paths are PDAEQ and PDBEQ, giving $F_X = (\tfrac{1}{2})^5 + (\tfrac{1}{2})^5 = 1/16$.

(2) C and G are now also full sibs, so there are two more paths with $n = 5$, giving
$F_X = 4 \times (\tfrac{1}{2})^5 = 1/8$.

(3) Let D be the uncle who marries his niece Q. The paths are DAEQ and DBEQ, giving
$F = 2 \times (\tfrac{1}{2})^4 = 1/8$.

To apply [5.2] to (1) and (2) we need the four coancestries, CE, CG, DE, DG.
With (1) all are 0 except DE which is $\tfrac{1}{4}$ because D and E are full sibs.
$F_X = \tfrac{1}{4}(\tfrac{1}{4}) = 1/16$. With (2) CG is also $\tfrac{1}{4}$;
$F_X = \tfrac{1}{4}(\tfrac{1}{2}) = 1/8$. With (3) the coancestries needed
are AE, AG, BE, BG, of which AE and BE are $\tfrac{1}{4}$, being parent and offspring, and
the others are 0; $F_X = 1/8$.
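
The path-counting arithmetic can be sketched in code. The sketch below assumes that [5.1] has the usual form $F_X = \Sigma(\tfrac{1}{2})^n(1 + F_A)$, where $n$ is the number of individuals in a path and $F_A$ the inbreeding coefficient of the common ancestor; the function name and its arguments are illustrative only. Exact fractions are used so that the results can be compared directly with those above.

```python
from fractions import Fraction


def inbreeding_from_paths(path_lengths, ancestor_f=None):
    """Sum (1/2)^n * (1 + F_A) over the paths through common ancestors.

    path_lengths: number of individuals in each path (n).
    ancestor_f:   inbreeding coefficient of each path's common ancestor;
                  taken as zero throughout when omitted (an assumption
                  that holds for the pedigree in this problem).
    """
    if ancestor_f is None:
        ancestor_f = [Fraction(0)] * len(path_lengths)
    half = Fraction(1, 2)
    return sum(half ** n * (1 + fa) for n, fa in zip(path_lengths, ancestor_f))


print(inbreeding_from_paths([5, 5]))        # (1) first cousins
print(inbreeding_from_paths([5, 5, 5, 5]))  # (2) double first cousins
print(inbreeding_from_paths([4, 4]))        # (3) uncle and niece
```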

5 (6.1) The range of values is small enough that the observations do not need
to be grouped into wider classes. To construct the table, make a column of the leaf
numbers in order, as under X in the table. Then go through the data making a mark
for each plant against its leaf number. For ease of counting, make each fifth mark
diagonally through the previous four, making a ‘gate’. Finally count the marks, as
under n in the table. The calculations of means and variances are shown at the foot
of the table. The arithmetic is simplified if the leaf numbers are coded as deviations
from 12, as shown under x. The main difference is that the $F_2$ is more variable.

                              F1                      F2
   X     x     x²        n     nx    nx²        n     nx    nx²

  12     0      0        -      -      -        1      0      0
  13     1      1        1      1      1        3      3      3
  14     2      4        2      4      8        5     10     20
  15     3      9        7     21     63        4     12     36
  16     4     16       11     44    176        3     12     48
  17     5     25        1      5     25        3     15     75
  18     6     36        3     18    108        2     12     72
  19     7     49        -      -      -        1      7     49
  20     8     64        -      -      -        2     16    128
  21     9     81        -      -      -        1      9     81

  Σ                     25     93    381       25     96    512
  Mean                  15.72 ± 0.24           15.84 ± 0.49
  Variance               1.46                   5.97

Mean = $12 + \frac{\Sigma nx}{25}$

Variance = $\frac{\Sigma nx^2 - (\Sigma nx)^2/25}{24}$

Standard error of mean = $\sqrt{\text{Variance}/25}$
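
The coded calculation can be sketched as follows. This is an illustration only: the function name `coded_stats` is hypothetical, and the variance is computed with the divisor $n - 1$, which is an assumption that reproduces the tabulated values. The frequency distributions are those counted in the table.

```python
import math


def coded_stats(counts, origin=12):
    """Mean, variance and standard error from a frequency table.

    counts maps each class value X to its frequency n; values are
    coded as deviations x = X - origin to keep the sums small.
    """
    n = sum(counts.values())
    sum_nx = sum(c * (x - origin) for x, c in counts.items())
    sum_nx2 = sum(c * (x - origin) ** 2 for x, c in counts.items())
    mean = origin + sum_nx / n
    variance = (sum_nx2 - sum_nx ** 2 / n) / (n - 1)
    standard_error = math.sqrt(variance / n)
    return mean, variance, standard_error


f1 = {13: 1, 14: 2, 15: 7, 16: 11, 17: 1, 18: 3}
f2 = {12: 1, 13: 3, 14: 5, 15: 4, 16: 3, 17: 3, 18: 2, 19: 1, 20: 2, 21: 1}

for name, table in (("F1", f1), ("F2", f2)):
    mean, variance, se = coded_stats(table)
    print(name, round(mean, 2), round(variance, 2), round(se, 2))
```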



6 (7.1) Multiply frequency by activity, sum over genotypes, and (if per cent 

frequencies have been used) divide by the total of 100. 

freq. x activity 
AA 1,171.2 

AB 7,438.2 

BB 6,448.4 

AC 515.2 

BC 1,060.0 

Mean = 16,633.0/100 = 166.33 


7 (8.1) The first assumption is that the varieties crossed were homozygous
at all loci. Tobacco is normally self-pollinating so this is likely to be true. The $F_1$
variance is then wholly environmental in origin. The $F_2$ variance is both
environmental and genetic in origin. The second assumption is that the environmental
variance in the $F_2$ is the same as that in the $F_1$. With this assumption the genetic
variance is obtained by subtraction.

$F_2$ variance = $V_G + V_E$ = 5.97
$F_1$ variance = $V_E$       = 1.46

                 $V_G$       = 4.51

Degree of genetic determination

$\frac{V_G}{V_G + V_E} = \frac{4.51}{5.97} = 0.76$, or 76 per cent.


8 (9.1) The children are related as half sibs (see Problem 5.2) so, from Table
9.3, $r = \tfrac{1}{4}$.

9 (11.1) Use Appendix Table A to get the intensity of selection, $i$, from the
proportion selected, $p$. Then apply [11.3] taking $\sigma_P$ as the square root of the
variance. The working is as follows.

(1) (a) $R = 1.271 \times 0.37 \times 3.271 = 1.54$ g.

(b) $R = 0.798 \times 0.37 \times 3.271 = 0.97$ g.

(c) When $p$ is greater than 50 per cent, take $i$ for $1 - p$ and multiply by
$(1 - p)/p$. $i = 1.271 \times 0.25/0.75 = 0.424$; $R = 0.51$ g.

(2) $R = 1.755 \times 0.18 \times 1.304 = 0.41$ day.

(3) Intensity of selection on females = 1.159. Males are not selected, so
$i = \tfrac{1}{2} \times 1.159 = 0.5795$; $R = 0.5795 \times 0.22 \times 2.074 = 0.26$ young per
litter. When selection is for fertility the offspring are already born when the
individuals whose fertility has been measured are ready to be selected. An alternative
way of looking at the process is to regard the offspring of both sexes as being
selected on the basis of their mother’s fertility. The regression of offspring on
mothers is $\tfrac{1}{2}h^2$, so the response in this instance is
$R = 1.159 \times (\tfrac{1}{2} \times 0.22) \times 2.074 = 0.26$ young per litter.
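
Where Appendix Table A is not to hand, the intensity of selection can be computed directly. The sketch below assumes truncation selection on a normally distributed character in a large sample, for which $i = z/p$, with $z$ the height of the ordinate at the point of truncation; the function names are illustrative. The same expression serves for values of $p$ above 50 per cent, so the adjustment described in (c) is not needed here.

```python
from statistics import NormalDist


def selection_intensity(p):
    """Intensity of selection i for a proportion p selected (0 < p < 1)."""
    unit_normal = NormalDist()
    x = unit_normal.inv_cdf(1.0 - p)   # truncation point, in s.d. units
    z = unit_normal.pdf(x)             # height of the ordinate at x
    return z / p


def response(i, h2, sigma_p):
    """Predicted response, taking [11.3] to be R = i * h2 * sigma_P."""
    return i * h2 * sigma_p


for p in (0.25, 0.50, 0.75):
    i = selection_intensity(p)
    print(p, round(i, 3), round(response(i, 0.37, 3.271), 2))
```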

10 (17.1) The variances can be transformed to logarithms by [17.2] as follows.

                              Small     Control     Large

$C$                           0.143      0.111      0.128
$1 + C^2$                     1.0204     1.0123     1.0164
$\sigma^2$ (logs) × 100       0.3809     0.2306     0.3068
$\sigma$ (logs)               0.062      0.048      0.055
Mean of logs                  1.074      1.362      1.597
Response in logs                   0.288       0.235

If the responses are equal when transformed to logarithms, the ratio of the arithmetic
means will be equal:

L/C = 39.85/23.16 = 1.72
C/S = 23.16/11.97 = 1.93

The asymmetry is much reduced. Alternatively, the means of log-transformed data
can be calculated from the formula for the mean of $\log x$ in [17.1]. The values are
given above. The responses in log-units are nearly equal. This is to be expected from
the fact that the response is proportional to the standard deviation ([11.3]), and the
standard deviations of log-units are nearly equal.
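
The transformation can be sketched in code. Equations [17.1] and [17.2] are not reproduced in this section, so the forms used below are assumptions: they are the standard results for a character whose logarithm is normally distributed, and they reproduce the tabulated values to within rounding. The function names are illustrative.

```python
import math

LOG10_E = math.log10(math.e)   # 0.4343


def log10_variance(cv):
    """Variance of log10(x) from the coefficient of variation C of x.

    Assumed form of [17.2]: 0.4343 * log10(1 + C^2).
    """
    return LOG10_E * math.log10(1.0 + cv ** 2)


def log10_mean(mean, cv):
    """Mean of log10(x) from the arithmetic mean and C of x.

    Assumed form of [17.1]: log10(mean) - 0.5 * log10(1 + C^2).
    """
    return math.log10(mean) - 0.5 * math.log10(1.0 + cv ** 2)


lines = {"Small": (11.97, 0.143), "Control": (23.16, 0.111), "Large": (39.85, 0.128)}
for name, (mean, cv) in lines.items():
    print(name, round(100 * log10_variance(cv), 3), round(log10_mean(mean, cv), 3))
```

Small differences in the last decimal place from the table arise because the table works from $1 + C^2$ rounded to four places.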

11 (1.2) Non-tasters are homozygotes for the non-tasting gene. With
Hardy-Weinberg frequencies, the frequency of homozygotes is the square of the gene
frequency. So $q = \sqrt{0.3} = 0.55$.

12 (2.1) First get the coefficient of selection, $s$, against white-flowered plants.
The fitness relative to blue is 143/229 = 0.62, and $s = 1 - 0.62 = 0.38$.
White-flowered plants are homozygotes and their frequency is $q^2$, assuming random
pollination. From [2.12] the mutation rate is

$u = sq^2 = 0.38 \times 7.4 \times 10^{-4} = 2.8 \times 10^{-4}$

Note that the fitness, 0.62, is that of plants as females. The calculation assumes that
their fitness as males was equally reduced.

13 (3.2) Calculate the gene frequencies and Hardy-Weinberg expectations
separately for each race and compare with the observed numbers. The $\chi^2$ tests
agreement between observed and expected numbers.

            p (of a)      aa        ab        bb      $\chi^2_1$

Arctic       0.1214      18.0     260.0     941.0       1.8
Coastal      0.2649      96.3     534.3     741.4       2.2

The gene frequencies are different in the two races, so an excess of homozygotes
is expected in the mixed sample. Each race has genotype frequencies in good
agreement with the Hardy-Weinberg expectation for its own gene frequency.

So the subdivision into these two races, each having mated at random within the
race, is sufficient to account for the genotype frequencies in the total sample.

14 (5.2) The twins are genetically equivalent to a single individual. The two 
marriages are therefore equivalent to one individual with two spouses, so the children 
are related as half sibs and their coancestry is 1/8. (See [5.7].) If worked out from 
the pedigree by [5.1] the pair of twins must be shown by a single individual in the 
pedigree. 

15 (6.2) The data have to be grouped into classes. Grouping by 1 g intervals
makes 15 classes, which is satisfactory. The classes, $X$, are tabulated by the integral
part of the weight, i.e. ignoring the decimal part. The class intervals are 4.0-4.9,
5.0-5.9 etc., and the class mid-points are 4.45, 5.45 etc. The mean is therefore
$0.45 + \bar{X}$. Different groupings will give slightly different estimates of the mean
and variance.

X     4   5   6   7   8   9  10  11  12  13  14  15  16  17  18

n     1   2   5   1   0   0   1   6   4   8  10   7   2   2   1

              All      Pygmy     Normal

Mean         12.61      6.12     14.04
Variance     12.34      0.75      3.40

The distribution is bimodal, with no overlap. The 9 very small mice are clearly
distinct from the others in some way. In fact, they were homozygous for a dwarfing
gene called pygmy, which is used for examples in Chapters 7 and 8. The presence
of the pygmy gene greatly increases the variance.

16 (7.2) This can be done by use of [7.2], but it is probably simpler to
calculate the Hardy-Weinberg frequencies and get the mean by summing activity ×
frequency.

                            Frequency
        Activity      (1)       (2)       (3)

AA        122         0.04      0.25      0.64
AB        154         0.32      0.50      0.32
BB        188         0.64      0.25      0.04

Mean                174.48    154.50    134.88
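
The same summation can be written as a short function. This is a sketch only: the function name is illustrative, and the three gene frequencies of the A allele (0.2, 0.5 and 0.8) are read off from the AA frequencies in the table.

```python
def hardy_weinberg_mean(freq_a, value_aa, value_ab, value_bb):
    """Population mean as the sum of genotypic value x Hardy-Weinberg frequency."""
    freq_b = 1.0 - freq_a
    return (freq_a ** 2 * value_aa
            + 2 * freq_a * freq_b * value_ab
            + freq_b ** 2 * value_bb)


for freq_a in (0.2, 0.5, 0.8):
    print(freq_a, round(hardy_weinberg_mean(freq_a, 122, 154, 188), 2))
```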


17 (8.2) This only requires the substitution of $a$, $d$ and $q$ in [8.3a] and [8.4].
The values of $a$ and $d$ were found in solving the Chapter 7 problems.

                      $\alpha$     $d$      $q$     $2pq$    $V_A = 2pq\alpha^2$    $V_D = (2pqd)^2$

(1)   (1)              33.6       -1      0.2     0.32         361.27               0.10
      (2)              33.0       -1      0.5     0.50         544.50               0.25
      (3)              32.4       -1      0.8     0.32         335.92               0.10

(2)                     0         50      0.4     0.48           0                576.00

(3)   Gene b:           2.5       2.5     0.5     0.50           3.125              1.56
      Gene $c^e$:      11.4      28.5     0.2     0.32          41.59              83.17
      Both genes:                                               44.71              84.74
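
The substitution can be sketched in code, taking [8.3a] and [8.4] to be $V_A = 2pq\alpha^2$ and $V_D = (2pqd)^2$, with $\alpha = a + d(q - p)$. The function name is illustrative. The value $a = 33$ used for part (1) is not given in this section; it is an assumption derived here from the enzyme activities 122, 154 and 188 of Problem 7.2, whose mid-homozygote value is 155.

```python
def variance_components(a, d, q):
    """Average effect and the additive and dominance variances for one locus.

    q is the frequency of the allele that reduces the character.
    """
    p = 1.0 - q
    alpha = a + d * (q - p)            # average effect of a gene substitution
    v_a = 2 * p * q * alpha ** 2       # additive variance
    v_d = (2 * p * q * d) ** 2         # dominance variance
    return alpha, v_a, v_d


for q in (0.2, 0.5, 0.8):
    alpha, v_a, v_d = variance_components(33, -1, q)
    print(q, round(alpha, 1), round(v_a, 2), round(v_d, 2))

print(variance_components(10, 50, 0.4))   # part (2): alpha is zero
```

Part (2) shows a locus that contributes no additive variance at this gene frequency although the dominance variance is large.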


18 (9.2) The solution to Problem 8.2 (3) gave $V_A = 44.71$ and $V_D = 84.74$, to
which we have to add $V_E = \tfrac{1}{3}(V_A + V_D) = 43.15$. Adding the three components
gives $V_P = 172.60$. From the covariances in Table 9.3 we then get

(1) Regression of offspring on mid-parent = $V_A/V_P = 0.259$

(2) Regression of offspring on one parent = $\tfrac{1}{2}V_A/V_P = 0.130$

(3) Full-sib correlation = $(\tfrac{1}{2}V_A + \tfrac{1}{4}V_D)/V_P = 0.252$

(4) Half-sib correlation = $\tfrac{1}{4}V_A/V_P = 0.065$

(5) Double first-cousin correlation = $(\tfrac{1}{4}V_A + \tfrac{1}{16}V_D)/V_P = 0.095$
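
The five quantities follow mechanically from the variance components, as the sketch below shows. The coefficients are those used in the solution; the function name and the dictionary keys are illustrative, and environmental covariance between relatives is assumed to be absent.

```python
def resemblance(v_a, v_d, v_e):
    """Regressions and correlations between relatives from causal components."""
    v_p = v_a + v_d + v_e
    return {
        "offspring on mid-parent": v_a / v_p,
        "offspring on one parent": 0.5 * v_a / v_p,
        "full sibs": (0.5 * v_a + 0.25 * v_d) / v_p,
        "half sibs": 0.25 * v_a / v_p,
        "double first cousins": (0.25 * v_a + v_d / 16) / v_p,
    }


v_a, v_d = 44.71, 84.74
v_e = (v_a + v_d) / 3
for relationship, value in resemblance(v_a, v_d, v_e).items():
    print(relationship, round(value, 3))
```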
19 (11.2) For reasons to be explained in Chapters 19 and 20, this question
cannot be answered without making an important assumption, namely that the cause
of the selective survival was the bill depth itself and not some other character
correlated with it such as, for example, wing length. The assumption seems a reasonable
one in the circumstances described. With this assumption, then, the selection
differential is $S = 9.96 - 9.42 = 0.54$ mm. To predict the response we need to know
the heritability, which will be taken to be 0.82 from Problem 10.8. The predicted
response, by [11.2], is $R = 0.82 \times 0.54 = 0.44$ mm, an increase of 5 per cent.
The predicted mean in the progeny of the survivors is 9.42 + 0.44 = 9.86 mm.
The assumption made is the subject of Problem 20.1.

20 (17.2) Calculate the heterosis as the difference between the cross and the
mid-parental value. Do this first for the arithmetic values given. Then convert all
the arithmetic values given to logarithms and calculate the heterosis from these. The
heterosis on the two scales is shown below. We cannot use [17.1] to evaluate the
mean of logarithmic values because the coefficients of variation are not given, so
the scale transformation has to be done less accurately by taking the logarithms of
the arithmetic means given. Logarithms to base 10 are used here. Natural logarithms,
$\log_e$, could equally well have been used. The relationship between the two is
$\log_{10} x = 0.4343 \log_e x$.

              Arithmetic                          Logarithmic

         L        C        S                 L        C        S

L      1.94     1.135    0.025        L    0.028    0.025    0.023
C        -      0.92     0.53         C      -      0.019    0.019
S        -        -     -0.07         S      -        -     -0.002

On the arithmetic scale the heterosis varies greatly according to the size of the lines
crossed; it is not easy to see if crosses between size groups differ from crosses within
size groups. On the logarithmic scale the heterosis is nearly the same in all crosses,
both between and within size groups, except the S × S cross which is anomalous
on both scales. The difference in the absolute magnitude of the heterosis on the two
scales has no meaning.

21 (1.3) For an approximate answer the working is easier if frequencies are
expressed as fractions. First get the gene frequency from $q^2 = 1/20{,}000$;
$q = 1/141$. Carriers are heterozygotes, with frequency

$2q(1 - q) = 2 \times \frac{1}{141} \times \frac{140}{141}$

$= \frac{2}{141}$ (approx.)

= 1 in 70 (approx.)

22 (2.2) Homozygotes for white will be extremely rare and can be neglected. The
frequency of white-flowered plants is then the frequency of heterozygotes, $H$. By
[2.15] the mutation rate from blue to white is
$v = \tfrac{1}{2}sH = \tfrac{1}{2}(0.38 \times 7.4 \times 10^{-4}) = 1.4 \times 10^{-4}$.

23 (3.3) $N = 40$. By [3.7], $\Delta F = 1/80 = 0.0125$

By [3.12], $F(t = 5) = 1 - (0.9875)^5 = 0.061$

$F(t = 10) = 1 - (0.9875)^{10} = 0.118$

24 (5.3) First redraw the pedigree showing the paths of transmission more 
clearly, and put in the parents of generation I, thus: 


A B C D 



There are 6 paths, 2 through the common ancestors E and F, with $n = 5$, and 4
through A, B, C, and D, all with $n = 7$. None of the six common ancestors is inbred.
By [5.1]

$F_X = 2(\tfrac{1}{2})^5 + 4(\tfrac{1}{2})^7$

$= (\tfrac{1}{2})^4 + (\tfrac{1}{2})^5$

$= \tfrac{1}{16} + \tfrac{1}{32}$

$= \tfrac{3}{32}$

$= 0.09375$

The parent, Q, is the child of a double first-cousin marriage, with $F = \tfrac{1}{8}$, as seen
in Problem 5.1. Note that the inbreeding of the parent does not affect the inbreeding
of the children.

25 (6.3) There are 5 genotypic classes, with measurement values of -4, -3,
-2, -1, 0, according to the number of loci that are homozygous for the recessive
allele. The frequencies of these classes are given by the terms of the binomial
expansion of $(a + b)^4$, where $a$ is the probability of being homozygous at any
particular locus. Here $a = (0.3)^2 = 0.09$, and $b = 0.91$. The terms of the binomial
expansion are

$a^4 \qquad 4a^3b \qquad 6a^2b^2 \qquad 4ab^3 \qquad b^4$

The coefficients, 1, 4, 6, 4, 1, are most easily got from Pascal’s triangle with
$n = 4$, given below up to $n = 6$.

n

1                    1     1
2                 1     2     1
3              1     3     3     1
4           1     4     6     4     1
5        1     5    10    10     5     1
6     1     6    15    20    15     6     1

The required frequencies, to 3 decimal places, are

Measurement class      -4       -3       -2       -1        0      Total

Frequency             0.000    0.003    0.040    0.271    0.686    1.000

The extreme asymmetry is the consequence of the low frequency of the recessive
homozygotes.
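
The binomial terms can be generated without writing out Pascal’s triangle. The sketch below is an illustration only, with a hypothetical function name; it assumes, as the solution does, that the loci are independent, have equal gene frequencies, and each reduce the measurement by one unit when homozygous for the recessive allele.

```python
from math import comb


def class_frequencies(q, loci):
    """Frequency of each measurement class, keyed by its value (0 to -loci)."""
    a = q * q          # probability of a recessive homozygote at one locus
    b = 1.0 - a
    return {-k: comb(loci, k) * a ** k * b ** (loci - k)
            for k in range(loci, -1, -1)}


frequencies = class_frequencies(0.3, 4)
for value, frequency in frequencies.items():
    print(value, round(frequency, 3))
print("Total", round(sum(frequencies.values()), 3))
```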

26 (7.3) This can be done in three ways. (i) Calculate $M$ from [7.2] for several
different gene frequencies and graph $M$ against $q$. This, of course, gives only an
approximate answer.

(ii) Treat the metric value as if it were fitness and use [2.19] to find the
equilibrium gene frequency. For this we need the fitness of each homozygote relative
to the heterozygote. $(1 - s_1) = 110/150 = 0.733$, $(1 - s_2) = 90/150 = 0.600$;
$s_1 = 0.267$, $s_2 = 0.400$. Then, if $q$ is the equilibrium frequency of $A_2$, by [2.19]
$q = 0.267/0.667 = 0.4$.

(iii) Differentiate [7.2] with respect to $q$ and equate to 0. (In using [7.2]
care must be taken to ensure that $p$ is the frequency of the allele that confers the
higher value, in this case $A_1$.) Substituting $p = 1 - q$ in [7.2] and rearranging gives

$M = a + 2(d - a)q - 2dq^2$

$dM/dq = 2(d - a) - 4dq$

$q = \frac{d - a}{2d}$

The mid-homozygote value is 100, and $a = 10$, $d = 50$. Thus $q = 40/100 = 0.4$.

Substituting these values of $a$, $d$ and $q$ in [7.2] gives $M = 26$ as the deviation
from the mid-homozygote value. The population mean is therefore 100 + 26 = 126.
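
Methods (i) and (iii) can be compared numerically. The sketch below takes [7.2] to be $M = a(p - q) + 2dpq$, which is the form that rearranges to the expression given in (iii); the function names are illustrative.

```python
def mean_deviation(q, a, d):
    """Population mean as a deviation from the mid-homozygote value."""
    p = 1.0 - q
    return a * (p - q) + 2 * d * p * q


def q_at_maximum(a, d):
    """Gene frequency at which dM/dq = 0 (requires d > a for a maximum inside 0..1)."""
    return (d - a) / (2 * d)


a, d = 10, 50
q_max = q_at_maximum(a, d)
print(q_max, round(mean_deviation(q_max, a, d), 2))

# Method (i): a coarse scan over gene frequencies finds the same point
best_q = max((k / 100 for k in range(101)), key=lambda q: mean_deviation(q, a, d))
print(best_q)
```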

27 (8.3) The ratio $V_D/V_G$ is [8.4] divided by [8.8]. Let $c$ be the degree of
dominance, so that $d = ca$. Then $2pqa^2$ cancels out and the ratio reduces to

$\frac{V_D}{V_G} = \frac{2pqc^2}{2pqc^2 + [1 + c(q - p)]^2}$

where $q$ is the frequency of the recessive allele. Before working this out for different
gene frequencies it is easier to substitute the value of $c$ and then simplify further.
Simplified expressions are given below, with the values of $V_D/V_G$ for four gene
frequencies. More values will have to be calculated for drawing the graphs.

                 $d = \tfrac{1}{2}a$                      $d = a$                $d = 2a$

$q$        $\frac{q - q^2}{q^2 + 3q + \frac{1}{2}}$      $\frac{1 - q}{1 + q}$      $\frac{q - q^2}{q^2 + \frac{1}{8}}$

0.2               0.140                     0.667                 0.970
0.4               0.129                     0.429                 0.842
0.6               0.090                     0.250                 0.495
0.8               0.045                     0.111                 0.209

Maximum           0.143                     → 1                   1.000
at q =            0.250                     → 0                   0.250

Note that most of the variance caused by a fully recessive allele at low frequency
is dominance variance; but when dominance is less than complete the proportion
of dominance variance is not great.
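
For drawing the graphs, the general expression for the ratio can be evaluated directly instead of simplifying it for each degree of dominance. The following is a sketch with an illustrative function name; $c$ is the degree of dominance and $q$ the frequency of the recessive allele, as in the solution.

```python
def dominance_fraction(q, c):
    """V_D / V_G for one locus with degree of dominance c = d/a."""
    p = 1.0 - q
    dominance_term = 2 * p * q * c ** 2
    additive_term = (1 + c * (q - p)) ** 2
    return dominance_term / (dominance_term + additive_term)


for q in (0.2, 0.4, 0.6, 0.8):
    print(q, [round(dominance_fraction(q, c), 3) for c in (0.5, 1, 2)])
```

With $c = 2$ the additive term vanishes at $q = 0.25$, which is why the ratio reaches 1.000 there.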

28 (9.3) The father-child and mother-child correlations are obviously
consistent. To make comparisons with the midparent-child correlation we have to
convert the correlations to regressions. In doing this we have to make two
assumptions: that the variances in parents and in children are the same, and that the
children are single children, not the means of several. With these assumptions, the
regressions of children on single parents are equal to the correlations and are estimates
of $\tfrac{1}{2}V_A/V_P$, and the estimates of $V_A/V_P$ are 1.00 and 0.98. The midparent-child
correlation estimates $\sqrt{\tfrac{1}{2}}\,V_A/V_P$, giving $V_A/V_P = 0.69/\sqrt{\tfrac{1}{2}} = 0.98$. The three
correlations are therefore all consistent.

29 (11.3) The intensity of selection is obtained from Appendix Table B; it is
different in the two sexes. For males, $n = 4$, $N = 60$, giving $i_m = 1.882$. For
females, $n = 8$, $N = 60$, $i_f = 1.582$. The mean (see [11.6b]) is $i = 1.732$. The
predicted response per generation, from [11.3], is
$R = 1.732 \times 0.81 \times 111 = 155.7$ g.

If the response continued at the same rate the mean after five generations would
be 738 + (5 × 155.7) = 1,517 g. With such a high heritability, however, the
reduction of variance from selection is not negligible. A very rough prediction can be
made by taking the mean proportion selected, i.e. 10 per cent. Interpolation in Table
11.2 then gives $R_2 = 0.78 \times R_1 = 121$ g. Ignoring the small subsequent
reduction of the response, the predicted mean after five generations is 738 + 155.7 +
(4 × 121) = 1,378 g.

30 (17.3) Since the effect increases with the mean we might first try a
log-transformation; or more simply, look at the ratio H/L to see if it is constant. The
difference between the X chromosomes after transformation to logs is given on the
left below. It is a considerable improvement over the arithmetic difference, but it
still increases with the mean. We therefore need a stronger transformation. This
could be achieved by subtracting some constant before transformation to logs. One
might guess that the 4 larger bristles should be discounted on the grounds that they
are nearly invariant. The transformation would then be to $\log (x - 4)$, where $x$ is
the bristle number as counted. This transformation works well, as shown below,
and renders the X chromosome effect independent of the autosomal level.

        Difference               Log (x - 4)
         of logs         High        Low      Difference

A         0.088          0.740      0.574       0.166
B         0.090          0.970      0.835       0.135
C         0.143          1.272      1.092       0.180
D         0.147          1.490      1.319       0.171
E         0.164          1.641      1.459       0.182

The required transformation can be arrived at more rationally as follows. Plot
the High against the Low arithmetic values, on arithmetic paper. The points lie nearly
on a straight line. We are looking for a scale on the axes which will make the line
pass through the origin, i.e. zero on both axes. This will give a graph of the form
$y = bx$, so that the ratio $y/x$ is constant. If the line plotted is extended downwards
it will be found to pass through, or close to, the point H = 4, L = 4. This means
that $(H - 4)/(L - 4)$ is constant, and the required transformation is $\log (x - 4)$.


31 (1.4)

$\dfrac{\text{Heterozygotes}}{\text{Normals}} = \dfrac{2pq}{1 - q^2} = \dfrac{2q(1 - q)}{(1 + q)(1 - q)} = \dfrac{2q}{1 + q}$

Putting

$\dfrac{2q}{1 + q} = \dfrac{1}{3}$

gives the solution, $q = \tfrac{1}{5} = 0.2$.

(Note: Remember that when the gene frequencies of two alleles are written as p
and q, p + q = 1, and so p = 1 - q. In doing the algebra of gene frequencies
it is usually best to substitute p = 1 - q, or q = 1 - p, as a first step.)

32 (2.3) By [2.4] the gene frequency of a will be

$10^{-4}/(10^{-4} + 10^{-5}) = 1/(1 + 10^{-1}) = 10/11 = 0.90909$

The Hardy-Weinberg frequencies are then AA = 0.0083, Aa = 0.1653,
aa = 0.8264.


33 (3.4) The expected mean is the gene frequency in the base population, i.e. 
0.3. This would be found if the number of students was large. 

The sample subjected to electrophoresis can be regarded as a sixth generation of
parents, so that there have been six generations of random drift. First get F (t = 6),
as in Problem 3.3, but with N = 20. This gives F = 0.141. Then, by [3.14],
the variance of the gene frequencies would be $\sigma_q^2 = 0.3 \times 0.7 \times 0.141 = 0.030$.
The standard deviation of the students' estimates would be $\sqrt{0.030} = 0.17$.
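
As a check on these figures, the sketch below recomputes F and the variance of gene frequency. It assumes that the formula used in Problem 3.3 is the usual idealized-population expression $F_t = 1 - (1 - 1/2N)^t$; that assumption is not stated in this solution, though it does reproduce the quoted F = 0.141.

```python
import math

N = 20     # parents per generation
t = 6      # generations of random drift
q0 = 0.3   # gene frequency in the base population

# Assumed formula: F_t = 1 - (1 - 1/(2N))**t for an idealized population.
F = 1 - (1 - 1 / (2 * N)) ** t

var_q = q0 * (1 - q0) * F    # variance of gene frequency among samples, as in [3.14]
sd_q = math.sqrt(var_q)

print(round(F, 3), round(var_q, 3), round(sd_q, 2))
```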

34 (5.4) First get the inbreeding coefficient by [5.15] as

$F = \dfrac{1 - 0.01}{1 + 0.01} = 0.9802$

Then, by [3.15], the frequency of heterozygotes relative to the Hardy-Weinberg
frequency is

$H_t/H_0 = 1 - F = 0.0198$

The Hardy-Weinberg frequency is

$H_0 = 2 \times 0.2 \times 0.8 = 0.32$

The required frequency of heterozygotes in the population is therefore

$H_t = 0.0198 \times 0.32 = 0.0063$, or 0.63 per cent

35 (6.4) The binomial frequencies have first to be worked out for the two
kinds of loci separately. In each case there are 3 genotypic classes, with binomial
frequencies $a^2$, $2ab$, $b^2$, where a and b are as follows.

Loci with q = 0.3: a = 0.09, b = 0.91
Loci with q = 0.7: a = 0.49, b = 0.51

The binomial frequencies and genotypic classes of the two kinds of loci are shown 
in the margins (top and left) of the following table. 



The two kinds of loci are put together in the body of the table. The genotypic class 
is got by adding the two marginal classes, and the frequency is got by multiplying 
the two marginal frequencies. There are five genotypic classes. Adding together 
the frequencies in the cells representing the same class gives the frequency distribu¬ 
tion as follows. 


Measurement class     -4       -3       -2       -1        0      Total

Frequency            0.002    0.043    0.283    0.457    0.215    1.000
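
The table-based procedure amounts to expanding $(a + b)^2$ for each kind of locus and then multiplying the two distributions together. The sketch below does this under the reading that there are two loci of each kind and that each recessive-homozygous locus lowers the measurement by one unit; that reading is inferred from the classes -4 to 0, and the function names are illustrative.

```python
def genotype_classes(a, n_loci=2):
    """Distribution of measurement classes for n_loci loci of one kind.

    a is the frequency of the recessive homozygote at each locus; every
    recessive-homozygous locus is assumed to lower the measurement by 1 unit.
    """
    b = 1 - a
    dist = {0: 1.0}
    for _ in range(n_loci):
        new = {}
        for cls, freq in dist.items():
            new[cls - 1] = new.get(cls - 1, 0.0) + freq * a  # recessive homozygote
            new[cls] = new.get(cls, 0.0) + freq * b          # dominant phenotype
        dist = new
    return dist


def combine(d1, d2):
    """Add the marginal classes and multiply the marginal frequencies."""
    out = {}
    for c1, f1 in d1.items():
        for c2, f2 in d2.items():
            out[c1 + c2] = out.get(c1 + c2, 0.0) + f1 * f2
    return out


total = combine(genotype_classes(0.09), genotype_classes(0.49))
for cls in sorted(total):
    print(cls, round(total[cls], 3))
```

The unrounded frequency for class -1 is about 0.4565, so it may print as 0.456; the tabulated 0.457 makes the rounded column add to 1.000.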


36 (7.4) The genotypic values as defined in Fig. 7.1 are $a_b = \tfrac{1}{2}(95 - 90) = 2.5$;
$a_c = \tfrac{1}{2}(95 - 38) = 28.5$. (Subscript c denotes the $c^e$ gene.) In both cases
d = a, and [7.2] then reduces to $M = a(1 - 2q^2)$, where q is the frequency of
the recessive and reducing allele. This gives

$M_b = 2.5[1 - 2(0.5)^2] = 1.25$ and $M_c = 28.5[1 - 2(0.2)^2] = 26.22$

Putting both genes together as in [7.3], M = 1.25 + 26.22 = 27.47. This is the
deviation from the mid-point of the two double homozygotes. The value of the double
dominant homozygote is 95 as stated. Given that the gene effects are additive, the
value of the double recessive homozygote is

$95 - 2a_b - 2a_c = 95 - 5 - 57 = 33$

The mid-homozygote value is therefore $\tfrac{1}{2}(95 + 33) = 64$. This makes the mean
granule number 64 + 27.47 = 91.47.

An alternative and quicker way to get the mean is from the values and frequencies 
of the four genotypes, as shown below. The values, i.e. granule numbers, entered 
are those given in Example 7.3. 


Genotype                    B-             bb

             Freq.         0.75           0.25

C-           0.96        0.72 (95)      0.24 (90)

$c^e c^e$    0.04        0.03 (38)      0.01 (34)

(Each cell gives the genotype frequency, with the granule number in parentheses.)

Mean = (95 × 0.72) + (90 × 0.24) + (38 × 0.03) + (34 × 0.01) = 91.48


37 (8.4) We have to get $V_A$ by [8.3a] and $V_D$ by [8.4] for each gene separately
and then add them together. The genotypic values, a and d, for doing this must be
taken from the means in the margins of table (ii) in the solution to Problem 7.9.

Gene b: $a = d = \tfrac{1}{2}(0.228 + 0.342) = 0.285$; $q = \sqrt{0.4} = 0.6325$
Gene $c^e$: $a = d = \tfrac{1}{2}(0.060 + 0.240) = 0.150$; $q = \sqrt{0.2} = 0.4472$

Gene b: $\alpha = 0.285(1 + 0.6325 - 0.3675) = 0.3605$
Gene $c^e$: $\alpha = 0.150(1 + 0.4472 - 0.5528) = 0.1342$


                 $V_A = 2pq\alpha^2$      $V_D = (2pqd)^2$
Gene b                0.0604                  0.0176
Gene $c^e$            0.0089                  0.0055
Both genes            0.0693                  0.0231


The interaction variance, $V_I$, is calculated directly from the interaction deviations
in table (iv) of the solution to Problem 7.9. The values in the table are deviations
from the population mean, so their variance is simply the mean of their squares.
To get $V_I$, therefore, multiply the square of each interaction deviation by the
frequency of the genotype and add:

$V_I = 0.48(0.04)^2 + 0.32(-0.06)^2 + 0.12(-0.16)^2 + 0.08(0.24)^2 = 0.0096$

$V_A$ = 0.0693 =  67.94%
$V_D$ = 0.0231 =  22.65%
$V_I$ = 0.0096 =   9.41%
$V_G$ = 0.1020 = 100.00%

To check, work out $V_G$ directly from the values in table (ii) of Problem 7.9. The
variance of the additive expectations in table (iii) gives $V_A + V_D$.
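
The whole partition can be recomputed from the quantities quoted in this solution. The sketch below is only a check under the stated special case a = d for both genes; the function name `components` is illustrative, and the interaction deviations and genotype frequencies are those written out in the expression for $V_I$.

```python
import math


def components(a, d, q):
    """Average effect, additive variance and dominance variance for one locus."""
    p = 1 - q
    alpha = a + d * (q - p)         # average effect of the gene substitution
    v_a = 2 * p * q * alpha ** 2    # V_A = 2pq * alpha^2
    v_d = (2 * p * q * d) ** 2      # V_D = (2pqd)^2
    return alpha, v_a, v_d


alpha_b, va_b, vd_b = components(0.285, 0.285, math.sqrt(0.4))
alpha_c, va_c, vd_c = components(0.150, 0.150, math.sqrt(0.2))

# Interaction variance: frequency-weighted mean square of the interaction deviations.
freqs = [0.48, 0.32, 0.12, 0.08]
devs = [0.04, -0.06, -0.16, 0.24]
v_i = sum(f * x ** 2 for f, x in zip(freqs, devs))

v_a = va_b + va_c
v_d = vd_b + vd_c
v_g = v_a + v_d + v_i

for name, v in (("V_A", v_a), ("V_D", v_d), ("V_I", v_i), ("V_G", v_g)):
    print(name, round(v, 4), f"{100 * v / v_g:.1f}%")
```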

38 (9.4) With standardized litters the full-sib correlation is a little higher than
the daughter-dam regression. This could be due to dominance or common environment
or both. The size of the litter in which a female is reared affects her own subsequent
litter size for the following reason. Females reared in larger litters have to
share their prenatal and pre-weaning nutrition with a larger number of competitors
and they are consequently smaller at weaning and as adults. Being smaller, they
ovulate fewer eggs and have smaller litters. Sibs reared as litter mates share the
environment of the litter in which they were reared. The variation of litter size when
litters are not standardized thus causes environmental covariance which increases
the full-sib correlation, as seen in the data. The size of litter in which a female is
reared is the mother's litter size. So, when litters are not standardized, mothers with
larger litters tend to have daughters with smaller litters. There is therefore a negative
environmental covariance of daughters' with mothers' litter sizes. This counterbalances
the positive genetic covariance and the resultant daughter-dam regression
is nearly zero when litters are not standardized. Note how this maternal effect causes
a positive environmental covariance of litter mates but a negative environmental
covariance of offspring with parents.

39 (11.4) The selection differential in each generation has first to be calculated
as S = P - M. This is the selection to which the response is seen in the next generation.
Each selection differential is therefore entered in the table against the progeny
generation. Generation 0 has had no selection applied to it. The figures in the table
under R, for response, are simply the generation means. Next, the selection differentials
are added successively to give the cumulated selection differential as shown
under $\Sigma S$. For the divergence, the values of R and $\Sigma S$ in each generation are got
by subtracting those of the Low line from those of the High line. The realized
heritability is estimated by the regression of R on $\Sigma S$, with the values for generation
0 included. Alternatively, it can be estimated, though less reliably, as the ratio
of the total response to the total selection. For the High line this is (2.44 - 2.16)/0.45
= 0.62.


                      High                      Low                 Divergence
Gen.            R      S    $\Sigma S$     R      S    $\Sigma S$     R    $\Sigma S$

0             2.16    0       0          2.16    0       0          0       0
1             2.26   0.16    0.16        2.06  -0.14   -0.14       0.20    0.30
2             2.26   0.08    0.24        2.03  -0.06   -0.20       0.23    0.44
3             2.33   0.11    0.35        2.02  -0.06   -0.26       0.31    0.61
4             2.45   0.08    0.43        2.05  -0.06   -0.32       0.40    0.75
5             2.44   0.02    0.45        2.01  -0.04   -0.36       0.43    0.81

Reg. R on $\Sigma S$          0.631                     0.362                 0.512
Total response/
Total selection               0.62                      0.42                  0.53
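
Both estimates of realized heritability can be reproduced from the tabulated generation means and cumulated selection differentials. The sketch below uses an ordinary least-squares slope with generation 0 included, as the solution describes; the helper `slope` and the list names are illustrative.

```python
def slope(x, y):
    """Least-squares regression coefficient of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Generation means (R) and cumulated selection differentials, generations 0-5.
high_R = [2.16, 2.26, 2.26, 2.33, 2.45, 2.44]
high_S = [0, 0.16, 0.24, 0.35, 0.43, 0.45]
low_R = [2.16, 2.06, 2.03, 2.02, 2.05, 2.01]
low_S = [0, -0.14, -0.20, -0.26, -0.32, -0.36]

# Divergence: High minus Low, generation by generation.
div_R = [h - l for h, l in zip(high_R, low_R)]
div_S = [h - l for h, l in zip(high_S, low_S)]

for name, S, R in (("High", high_S, high_R),
                   ("Low", low_S, low_R),
                   ("Divergence", div_S, div_R)):
    ratio = (R[-1] - R[0]) / (S[-1] - S[0])   # total response / total selection
    print(name, round(slope(S, R), 3), round(ratio, 2))
```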

40 (19.1) Heritability (see Table 10.4)

Weight gain:             $h_G^2 = 4 \times \dfrac{1{,}602}{12{,}321} = 0.52$

Food consumption:        $h_F^2 = 4 \times \dfrac{6{,}150}{61{,}504} = 0.40$

Phenotypic correlation:  $r_P = \dfrac{22{,}848}{\sqrt{12{,}321 \times 61{,}504}} = 0.83$

Genetic correlation:     $r_A = \dfrac{2{,}229}{\sqrt{1{,}602 \times 6{,}150}} = 0.71$

Environmental correlation:

$r_E = \dfrac{22{,}848 - (4 \times 2{,}229)}{\sqrt{[12{,}321 - (4 \times 1{,}602)][61{,}504 - (4 \times 6{,}150)]}} = 0.94$

Alternatively, the environmental correlation can be calculated from [19.1] for which
the following are needed

$h_G = 0.7211$          $e_G = 0.6928$
$h_F = 0.6325$          $e_F = 0.7746$
$h_G h_F = 0.4561$      $e_G e_F = 0.5366$

Substituting into [19.1] gives

$0.83 = (0.71 \times 0.4561) + (r_E \times 0.5366)$
$r_E = 0.94$


41 (1.5) The genes can be counted and their frequencies determined by extension
of [1.1].

A: $0.096 + \tfrac{1}{2}(0.483 + 0.028) = 0.3515$
B: $0.343 + \tfrac{1}{2}(0.483 + 0.050) = 0.6095$
C: $0 + \tfrac{1}{2}(0.028 + 0.050) = 0.0390$
Total 1.0000

The Hardy-Weinberg expectation of CC is $(0.0390)^2 = 0.0015$. The expected
number in a sample of 178 is 0.27. None was found because the expectation was
well below 1.

42 (2.4) None. The equilibrium by [2.4] is the same when both rates are
increased by the same proportion.

43 (3.5) Problem 3.4 gave $\sigma_q^2 = 0.03$. The initial gene frequencies were 0.3 and
0.7. Then by [3.5] the genotype frequencies will be

$A_1A_1$: 0.09 + 0.03 = 0.12
$A_1A_2$: 0.42 - 0.06 = 0.36
$A_2A_2$: 0.49 + 0.03 = 0.52


44 (5.5) The solution comes from [5.15] rearranged to give C in terms of F.
F is got by [3.15], but for this we need the Hardy-Weinberg frequency of
heterozygotes, $H_0$. The gene frequencies, by [1.1], are 0.6 and 0.4. So $H_0 = 0.48$.
Then, by [3.15],

1 - F = 0.12/0.48 = 0.25;

F = 0.75.

[5.15] rearranged becomes

$C = \dfrac{1 - F}{1 + F} = \dfrac{0.25}{1.75} = 0.143$

The indicated frequency of cross-pollination is 14 per cent.

45 (6.5) To get the frequencies from the binomial expansion in this case we
have to work with genes, not genotypes. Because there is no dominance, the measurement
is determined by the number of increasing alleles, each one adding 1 unit of
measurement. With 3 loci, the genotype can have from 0 to 6 increasing alleles,
making 7 classes. The frequencies are given by the expansion of $(p + q)^6$, where
p and q are the gene frequencies, 0.4 and 0.6.


Measurement class     0        1          2            3            4          5        6

Frequency          $p^6$   $6p^5q$   $15p^4q^2$   $20p^3q^3$   $15p^2q^4$   $6pq^5$   $q^6$    Total

Frequency, %         0.4      3.7       13.8         27.6         31.1        18.7      4.7     100.0


46 (7.5) It is best to get the average effect of the gene substitution, $\alpha$, first
from [7.5]. For this we have to evaluate a and d as defined in Fig. 7.1. Taking
the values of the enzyme activities of the genotypes AA, AB and BB as given in
Problem 7.1, a and d are calculated as

$a = \tfrac{1}{2}(188 - 122) = 33$
$d = 154 - \tfrac{1}{2}(188 + 122) = -1$

In reality there is probably no dominance, i.e. d = 0, but the solution will be worked
out with d = -1 as if it were real. The gene frequencies, q, specified in Problem
7.2 are those of allele A which is the allele conferring the lower value, and this
is what is required in [7.5].

(1) q = 0.2. Substitution into [7.5] gives

$\alpha = 33 + [-1(0.2 - 0.8)] = 33 + 0.6 = 33.6$.

The average effects of the alleles separately are given by [7.6]. Here $\alpha_1$ refers to
the allele whose frequency is p, which is B. So, from [7.6]

$\alpha_B = 0.2 \times 33.6 = 6.72$
$\alpha_A = -0.8 \times 33.6 = -26.88$

Check that $\alpha_B - \alpha_A = \alpha$.

(2) q = 0.5: $\alpha = 33 + 0 = 33$; $\alpha_B = 16.5$; $\alpha_A = -16.5$.

(3) q = 0.8: $\alpha = 33 - 0.6 = 32.4$; $\alpha_B = 25.92$; $\alpha_A = -6.48$.

(Note: with strictly no dominance (d = 0), $\alpha$ is equal to a and is the same for all
gene frequencies, but $\alpha_A$ and $\alpha_B$ remain dependent on the gene frequency.)

47 (8.5) For (1) we need the correlation, r, between first and second litters, and 
for (2) we need the regression, b, of second on first, and the two means. 

Let X represent first litters and Y represent second litters. 

$\Sigma X = 104$          $\Sigma Y = 103$

$\Sigma X^2 = 1{,}106$    $\Sigma Y^2 = 1{,}101$    $\Sigma XY = 1{,}089$

$\bar{X} = 10.4$          $\bar{Y} = 10.3$

The corrected sums of squares and products are

$\Sigma x^2 = 24.4$       $\Sigma y^2 = 40.1$       $\Sigma xy = 17.8$

The repeatability is estimated as the product moment correlation, which is

$r = 17.8/\sqrt{24.4 \times 40.1} = 0.57$

The estimate from this small sample is higher than would normally be found in larger
samples. The repeatability could be estimated as the intraclass correlation, which
works out to be 0.59, but this is not strictly valid when the variances of X and Y
differ, which they do: $\sigma_X^2 = 2.7$, $\sigma_Y^2 = 4.5$.

(2) The regression of second on first litter sizes is

$b_{YX} = 17.8/24.4 = 0.73$

Expected size of second litters: 

(a) 10.3 + 0.73(14 - 10.4) = 12.9 

(b) 10.3 + 0.73(5 - 10.4) = 6.4 

These are the predicted mean sizes of the second litters of (a) all mice whose first 
litters were 14 and (b) all mice whose first litters were 5. 
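
The correlation, regression and predictions all follow from the raw sums. The sketch below takes the number of mice as 10, which is implied by the means (104/10 = 10.4) rather than stated here; the function name `predict` is illustrative.

```python
import math

n = 10                                   # number of mice, implied by the means
sum_x, sum_y = 104, 103                  # sums of first (X) and second (Y) litter sizes
sum_x2, sum_y2, sum_xy = 1106, 1101, 1089

mean_x, mean_y = sum_x / n, sum_y / n
ss_x = sum_x2 - sum_x ** 2 / n           # corrected sum of squares of X
ss_y = sum_y2 - sum_y ** 2 / n           # corrected sum of squares of Y
sp_xy = sum_xy - sum_x * sum_y / n       # corrected sum of products

r = sp_xy / math.sqrt(ss_x * ss_y)       # repeatability as a product-moment correlation
b = sp_xy / ss_x                         # regression of second on first litter size


def predict(first_litter):
    """Expected second litter size given the size of the first litter."""
    return mean_y + b * (first_litter - mean_x)


print(round(r, 2), round(b, 2), round(predict(14), 1), round(predict(5), 1))
```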

48 (10.1) According to [10.5] the correlation or regression has to be multiplied
by 1/r, the values of r being given in Table 9.3. In the cases of (4) and (7)
the factors analogous to 1/r are explained in the text and are shown in parentheses
below.

          1/r      $h^2$

(1)        2       0.42
(2)        2       0.54
(3)        2       0.68
(4)       (1)      0.32
(5)        4       0.08
(6)        4       0.12
(7)       (2)      0.18

49 (11.5) The unweighted selection differential is the difference between the
mean of the parents and the mean of their generation; the calculation is shown in
the table. Note that selection was for reduced size, so a selection differential with
negative sign is what was desired. To get the weighted selection differential calculate
the weighted mean of the parents, weighting by the number of offspring. If P is
the parental value and n is the corresponding number of offspring, the weighted
mean is $\Sigma nP/\Sigma n$. For females this is

[(7.6 × 1) + (12.4 × 9) + . . .]/69 = 917.6/69 = 13.30

The rest of the calculation is shown in the table.


                                        Unweighted            Weighted
                                     Females    Males      Females    Males

Mean of parents (a)                   12.59     13.01       13.30     13.75
Mean of parents' generation (b)       13.14     14.80       13.14     14.80
Selection differential (a - b)        -0.55     -1.79       +0.16     -1.05
Mean of sexes                              -1.17                 -0.445


Natural selection for fertility in females opposed the artificial selection for small 
size, to an extent that the effective selection differential was in the wrong direction. 
The similar but much smaller effect in males was probably accidental. 

50 (19.2) The variances and covariance in the uniform population are 
environmental only. Subtraction of these from the values in the variable population 
estimates the genotypic variances and covariance (see Example 8.1). 


Population    Cause of variation    Correlation

Variable      G + E                 $0.87/\sqrt{0.366 \times 43.4} = 0.22 = r_P$
Uniform       E                     $0.27/\sqrt{0.186 \times 16.6} = 0.15 = r_E$
V - U         G                     $0.60/\sqrt{0.180 \times 26.8} = 0.27 = r_G$


The genetic correlation, $r_G$, estimated in this way is the correlation of genotypic
values. The genetic correlation, $r_A$, in Problem 19.1 was estimated from the sire
components and is the correlation of breeding values.

51 (1.6) The gene frequency in males is the frequency of affected males, given
as 0.07. Under Hardy-Weinberg equilibrium the gene frequency is the same in
females as in males. (1) The frequency of heterozygous women is

2q(1 - q) = 2 × 0.07 × 0.93 = 0.13,

or 13 per cent. (2) The frequency of colour-blind (i.e. homozygous) women is $q^2$
= 0.0049, or about 1 in 200. (3) The frequency of the marriage is the product of
the frequencies in men and in women, 0.07 × 0.0049 = 0.000343, or about 1 in

52 (2.5) The selection coefficient, originally s = 1, is halved by the treatment.
At equilibrium, present and future, the frequency of homozygotes, by [2.13], is $q^2$
= u/s. Halving s will double the frequency of homozygotes when the new equilibrium
is reached. The increase of gene frequency comes from mutation so it will take a
very long time to reach the new equilibrium.


53 (3.6) The observed frequency of heterozygotes is $H_t = 0.1343$. The gene
frequencies in the population as a whole, by [1.1], are 0.1413 and 0.8587. The
frequency of heterozygotes in a single random-breeding population with these gene
frequencies is $H_0 = 0.2427$. The panmictic index, by [3.15], is

$P = \dfrac{H_t}{H_0} = \dfrac{0.1343}{0.2427} = 0.553$

and the coefficient of inbreeding is

F = 0.447


54 (5.6) In the first period $\Delta F = 1/32$ by [4.1]. After 10 generations 1 - F
= P = $(31/32)^{10} = 0.728$ by [3.11]. In the second period $\Delta F = 1/64$. At the end,
referred to the beginning of the second period as base, P = $(63/64)^{10} = 0.854$.
Referred to the start of the first period as base, by [5.17],

P = 0.854 × 0.728 = 0.622
F = 0.378

equivalent to two generations of full-sib mating (Table 5.1).
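
The point of [5.17] is that panmictic indices from successive periods multiply. A minimal sketch of that calculation, using only the rates of inbreeding quoted above (the variable names are illustrative):

```python
# Panmictic index P = 1 - F falls by a factor (1 - delta_F) each generation.
P_first = (1 - 1 / 32) ** 10    # first 10 generations, delta_F = 1/32
P_second = (1 - 1 / 64) ** 10   # next 10 generations, delta_F = 1/64, own base

P_total = P_first * P_second    # referred to the start of the first period
F_total = 1 - P_total

print(round(P_first, 3), round(P_second, 3), round(P_total, 3), round(F_total, 3))
```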

55 (6.6) A symmetrical distribution results when $a = b = \tfrac{1}{2}$ in $(a + b)^n$.
(1) With recessives, $a = q^2$, where q is the frequency of the recessive alleles;
$q = \sqrt{\tfrac{1}{2}} = 0.707$. (2) With no dominance, a = q; q = 0.5.

56 (7.6) The breeding values are obtained from Table 7.3 where $A_1A_1$
corresponds with BB. The values of $\alpha$ needed were found in Problem 7.5.


                             (1)               (2)               (3)
                           q = 0.2           q = 0.5           q = 0.8
                       $\alpha$ = 33.6   $\alpha$ = 33.0   $\alpha$ = 32.4

BB   $2q\alpha$             13.44              33               51.84
AB   $(q - p)\alpha$       -20.16               0               19.44
AA   $-2p\alpha$           -53.76             -33              -12.96


These values are deviations from the population mean. Check that the mean breeding 
value is zero in each population: multiply the breeding value by the genotype 
frequency (Hardy—Weinberg) in that population and sum over genotypes. 
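
The suggested check can be carried out as follows. This is a sketch using a = 33 and d = -1 from Problem 7.5, with q the frequency of allele A (the lower-value allele); the function name `breeding_values` is illustrative.

```python
a, d = 33.0, -1.0   # genotypic values from Problem 7.5


def breeding_values(q):
    """Breeding values and Hardy-Weinberg frequencies; q is the frequency of A."""
    p = 1 - q                                  # frequency of allele B
    alpha = a + d * (q - p)                    # average effect of the gene substitution
    values = {"BB": 2 * q * alpha, "AB": (q - p) * alpha, "AA": -2 * p * alpha}
    freqs = {"BB": p * p, "AB": 2 * p * q, "AA": q * q}
    return alpha, values, freqs


for q in (0.2, 0.5, 0.8):
    alpha, values, freqs = breeding_values(q)
    mean_bv = sum(values[g] * freqs[g] for g in values)   # should be zero
    print(q, round(alpha, 1),
          {g: round(v, 2) for g, v in values.items()},
          abs(mean_bv) < 1e-9)
```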

57 (8.6) (1) The mean square for litter order is not relevant except to show that
the variance between litters of different order (first, second, etc.) has been removed
from the within-sow variance, i.e. from $V_{Ew}$. The repeatability is given by the intraclass
correlation, $r = \sigma_B^2/(\sigma_B^2 + \sigma_W^2)$, where $\sigma_B^2$ is the component between
sows and $\sigma_W^2$ is the component within sows. The mean square within sows is $\sigma_W^2$.
The mean square between sows is composed of $\sigma_W^2 + 10\sigma_B^2$. (The 10 is because
each sow had 10 litters.) Thus

$\sigma_B^2 = (25.56 - 3.23)/10 = 2.233$
$r = 2.233/(2.233 + 3.23) = 0.409$

Estimation of the repeatability by the intraclass correlation assumes that the variances
do not differ in successive litters.

(2) Basing the measure on the mean of more than one litter reduces $V_P$ (see
[8.14]), and the relative reduction is given by [8.15]. $V_A$ is unchanged. The
heritability when only one litter is used is $V_A/V_P$, and when n litters are used it is
$V_A/V_{P(n)}$. The heritability with n litters relative to the heritability with one litter
is therefore $V_P/V_{P(n)}$, which is simply the reciprocal of [8.15].


n       $V_{P(n)}/V_P$       $h_n^2/h^2$

2           0.704               1.42
3           0.606               1.65
4           0.557               1.80


Taking two litters would increase the heritability by 42 per cent, and taking four
litters by 80 per cent.
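
Both parts can be retraced numerically. The sketch below assumes that [8.15] has the usual form $V_{P(n)}/V_P = [1 + r(n - 1)]/n$; the formula is not written out in this solution, but it reproduces the tabulated values. The function name is illustrative.

```python
ms_between, ms_within = 25.56, 3.23   # mean squares between and within sows
litters_per_sow = 10

var_between = (ms_between - ms_within) / litters_per_sow   # between-sow component
r = var_between / (var_between + ms_within)                # repeatability


def relative_variance(n, r):
    """V_P(n)/V_P for the mean of n litters, assuming [8.15] = (1 + r(n - 1))/n."""
    return (1 + r * (n - 1)) / n


print(round(r, 3))
for n in (2, 3, 4):
    rel = relative_variance(n, r)
    print(n, round(rel, 3), round(1 / rel, 2))   # relative variance, relative heritability
```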

58 (10.2) The inconsistency between sons and daughters is removed if
correction is made for the difference in variance between males and females. The
correction factors, by which the regressions and their standard errors are to be
multiplied, are: for daughter-father, $\sigma_m/\sigma_f = 2.5/2.3 = 1.087$; for son-mother,
$\sigma_f/\sigma_m = 0.920$. The estimates of the heritability, obtained by making this correction
and then doubling the regression and its standard error, are as follows.


                  Father               Mother

Sons           0.646 ± 0.116        0.835 ± 0.105
Daughters      0.633 ± 0.096        0.840 ± 0.096
Mean           0.64 ± 0.075         0.84 ± 0.071


The estimates from mothers are substantially higher than those from fathers, which
can be attributed to a maternal effect. Consequently the regressions on mid-parent
are not useful, and the regressions on fathers provide the most reliable estimates.
There are no reasons to prefer the sons or the daughters, so we take the mean as
$h^2 = 0.64 \pm 0.075$. The standard error is obtained as $\tfrac{1}{2}\sqrt{(0.116)^2 + (0.096)^2}$.

Now consider the regressions on mothers. By doubling the regression, as in the
table, we have also doubled the covariance due to the maternal effect. Therefore
$\tfrac{1}{2}(0.84 - 0.64) = 0.10$ estimates the environmental covariance of mothers and
their children, expressed as a proportion of $V_P$.

59 (11.6) On average, 2 lambs from each ewe must be selected in order to replace
the parents. After one breeding season there are not enough lambs to provide
replacements, so the parents must be kept for at least 2 seasons. After 2 breeding
seasons each ewe has on average 2.4 lambs; selecting 2 out of 2.4 (p = 83 per cent)
makes i = 0.305 (from Appendix Table A because the numbers are large). The
generation interval is the mean age of the parents at the birth of their selected offspring,
which in this case is 2.5 years. The intensity of selection per year is then
0.122. The table shows this calculation for each year of age at which the parents
might be discarded, up to 7. The intensity of selection per year is maximal when
the parents have bred for 5 seasons and are 6 years old. The parents should therefore
be discarded after their 5th breeding season. The proportion selected in each year will
then be 1 out of 3 for the following reason. Each year 1 out of 5 pairs are replaced,
having produced on average 1.2 lambs in that year. The proportion of the total lambs
needed to replace them is therefore

$\dfrac{1}{5} \times \dfrac{2}{1.2} = \dfrac{1}{3}$

Or, when due for replacement each pair has produced on average a total of 6 lambs.
So to replace each pair, 2/6 = 1/3 must be selected.


Age of parents     Generation      Total lambs      p = 2/N
when discarded     interval (L)    per ewe (N)        (%)          i         i/L

      3               2.5             2.4             83         0.305      0.122
      4               3.0             3.6             56         0.704      0.235
      5               3.5             4.8             42         0.931      0.266
      6               4.0             6.0             33         1.097      0.274
      7               4.5             7.2             28         1.202      0.267
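
The optimum can also be located without the printed table of selection intensities. The sketch below computes i for truncation selection in a large normally distributed population as the ordinate at the truncation point divided by p, which is what Appendix Table A tabulates. The pattern of L and N (first lambs when the parents are 2 years old, 1.2 lambs per ewe per year) is read off the table above. Because p is not rounded to whole percentages here, the values of i differ slightly from the tabulated ones.

```python
from statistics import NormalDist


def intensity(p):
    """Selection intensity when the best proportion p of a large normal
    population is selected: ordinate at the truncation point divided by p."""
    nd = NormalDist()
    x = nd.inv_cdf(1 - p)      # truncation point in standard deviations
    return nd.pdf(x) / p


lambs_per_year = 1.2
results = {}
for age in range(3, 8):                 # age of parents when discarded
    seasons = age - 1                   # breeding seasons completed
    L = (2 + age) / 2                   # generation interval, years
    N = lambs_per_year * seasons        # total lambs per ewe
    p = 2 / N                           # proportion that must be selected
    results[age] = intensity(p) / L     # intensity of selection per year
    print(age, L, round(N, 1), round(100 * p), round(intensity(p), 3),
          round(results[age], 3))

best_age = max(results, key=results.get)
print("discard at age", best_age)
```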


60 (19.3) Taking the intensity of selection from Appendix Table A gives

Males: i = 1.755
Females: i = 1.400
Mean: i = 1.5775

The direct responses are predicted by [11.3] and the correlated responses by [19.6].
The data needed from Problem 19.1 are

$h_G^2 = 0.52$        $h_F^2 = 0.40$        $r_A = 0.71$
$\sigma_G = 111$ g    $\sigma_F = 248$ g

giving, for substitution into [19.6], $h_G h_F r_A = 0.324$.

Response of weight gain

Direct: R = 1.5775 × 0.52 × 111 = 91.1 g
Correlated: CR = 1.5775 × 0.324 × 111 = 56.7 g

Response of food consumption

Direct: R = 1.5775 × 0.40 × 248 = 156.5 g
Correlated: CR = 1.5775 × 0.324 × 248 = 126.8 g

61 (1.7) The gene frequency in the parents is known and should be used to calculate
the expectations. The frequency of so was 8/40 = 0.2. The Hardy-Weinberg
genotype frequencies in the progeny are, by [1.2], 0.04, 0.32, 0.64. Multiplying
these by the total number counted, 1,441, gives the expected numbers (to the nearest
whole number) as

so/so     so/cn     cn/cn
  58       461       922

There was an excess of homozygotes and a corresponding deficiency of heterozygotes.
Possible reason: assortative mating due to some of the female parents having mated
with their own stock males before being put in the vials. The discrepancy is highly
significant, with $\chi^2 = 125$. This $\chi^2$ has two degrees of freedom because the
observed numbers of progeny were not used to estimate the gene frequency, the
only constraint being that the expected numbers must add to the observed total.


62 (2.6) $q_0^2 = 1/2{,}500 = 0.0004$, $q_0 = 0.02$. By [2.12] the original mutation
rate is $u_0 = sq_0^2$, and with s = 1, $u_0 = q_0^2$. The change of gene frequency from
mutation is given by [2.3] and from selection, approximately, by [2.8]. Putting $u_1$
for the new mutation rate, the net change is

$\Delta q = u_1(1 - q_0) - sq_0^2(1 - q_0)$
$\quad = (u_1 - sq_0^2)(1 - q_0)$

Substituting $sq_0^2 = u_0$

$\Delta q = (u_1 - u_0)(1 - q_0)$

Putting $u_1 = 2u_0$, and substituting $u_0 = q_0^2$

$\Delta q = q_0^2(1 - q_0)$
$\quad = 0.0004 \times 0.98$
$\quad = 0.000392$

$q_1 = q_0 + \Delta q = 0.020392$

$q_1^2 = 0.0004158$ or 1 in 2,405

The incidence would be increased by 4 per cent of its original level. There would
be 16 additional cases per million births.
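
The one-generation change can be verified directly from the unsimplified expression for $\Delta q$. The sketch below uses only the quantities given in the solution; the variable names are illustrative.

```python
q0_sq = 1 / 2500          # original incidence of recessive homozygotes
q0 = q0_sq ** 0.5         # original gene frequency
s = 1.0                   # selection coefficient against homozygotes
u0 = s * q0_sq            # original mutation rate at equilibrium
u1 = 2 * u0               # doubled mutation rate

# Net change: gain from mutation minus (approximate) loss from selection.
delta_q = u1 * (1 - q0) - s * q0_sq * (1 - q0)
q1 = q0 + delta_q
incidence_1 = q1 ** 2

extra_per_million = (incidence_1 - q0_sq) * 1e6
relative_increase = incidence_1 / q0_sq - 1

print(round(delta_q, 6), round(q1, 6), round(1 / incidence_1),
      round(extra_per_million), round(100 * relative_increase))
```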

63 (3.7) (1) The gene frequencies in the sample are 0.6 and 0.4, from which
the Hardy-Weinberg frequency of heterozygotes is 0.48. The observed frequency
is in excess of the expectation, suggesting some form of selection against one or
both homozygotes. (2) With the sample coming from so few parents, chance differences
of gene frequency between male and female parents become important.
Here N = 8, and the expected frequency of heterozygotes, by [3.16], is

H = 0.48(1 + 1/16) = 0.51

This is close to the observed frequency and so the evidence for selection disappears.

64 (5.7) This can be worked out by pedigree analysis, using [5.1], or by
coancestries, using the rule in [5.3]. But it is simpler to get the inbreeding coefficient
directly from the frequency of homozygotes. This gives the inbreeding
coefficient by definition, because the genes come from highly inbred lines and all
homozygotes must therefore be homozygous for alleles that are identical by descent.
Call the inbred lines A and B. The $F_2$ individuals, taken all together, produce A
and B gametes in equal proportions, as do the $F_1$. So the backcross to the $F_1$ produces
$\tfrac{1}{4}$AA + $\tfrac{1}{4}$BB, and the backcross to A produces $\tfrac{1}{2}$AA. Both therefore
produce 50 per cent of homozygotes and the inbreeding coefficient of both progenies
is 0.5.

65 (20.2) Let Y denote IQ score and W denote fitness as measured by family
size. The change in IQ score is predicted by [20.5b]:

$CR_Y$ =

Taking $h_W^2$ to be 0.1 gives

$CR_Y = 0.11 \times \sqrt{0.6 \times 0.1} \times 15.4 \times 2.3 = 0.95$

Taking $h_W^2$ to be 0.2 gives $CR_Y = 1.35$. The predicted change is an increase of
about one IQ point per generation.

The correlated selection differential is given by [20.4] as the phenotypic covariance:

$S' = \mathrm{cov}_P = r_P\sigma_Y\sigma_W$
$\quad = 0.11 \times 15.4 \times 2.3$
$\quad = 3.9$ IQ points

66 (7.7) The breeding values and dominance deviations are given in Table
7.3. In solving Problem 7.3 we found a = 10, d = 50, q = 0.4. (q is the frequency
of the reducing allele as required.) The average effect of the gene substitution must
first be found from [7.5]. This is

$\alpha = 10 + 50(0.4 - 0.6) = 0$

All the breeding values are therefore zero. (The reason for this is that the population
is at its maximum mean value, which would be its equilibrium if the character
were fitness.) Substitution of p, q, and d gives the dominance deviations as

$A_1A_1$     $A_1A_2$     $A_2A_2$
  -16          +24          -36

Check by seeing that the mean dominance deviation of individuals in the population
is zero.

67 (12.1) The calculations are shown in the table. The following explanations
may be needed.

$\sigma_A = h\sigma_P = \sqrt{0.2} \times 2.10 = 0.94$

Total response at the limit (one-way), R = 23.55 - 11.90 = 11.65.

Half-life: This can be deduced as follows. Half the total response is 5.8.
Given S = 2.25 and $h^2 = 0.2$, the response per generation by [11.2] is
2.25 × 0.2 = 0.45. Given that the response was linear, the number of
generations required to give the response of 5.8 is 5.8/0.45 = 13. This
estimates the half-life.

Theoretical maximum response, by [12.2] noting that $i\sigma_P$ = selection
differential = 2.25, = 2 × 33 × 0.2 × 2.25 = 29.70.

Theoretical maximum half-life, assuming all genes to be additive, is
$1.4N_e = 46$.

The number of loci and their standardized effects (n and $2a/\sigma_P$) cannot be
calculated without knowing the lower selection limit. One might guess that downward
and upward selection would have produced equal responses, making the total range
R = 23.3; the figures in parentheses are based on this assumption. The assumption,
however, makes the growth at the lower limit improbably small, i.e. 11.90 - 11.65
= 0.25 g. A more reasonable guess would be that downward and upward selection
would have produced the same proportional change; upward selection doubled the
growth, so downward selection would have halved the growth, making the lower
limit 5.95 and the total range R = 17.6. The first entered figures are based on this
assumption. Remember, however, that estimates of gene numbers depend on several
other assumptions that are unlikely to be true.


One-way response, H/C = 23.55/11.90                          =  2.0
$R/\sigma_A$ = 11.65/0.94                                     = 12.4
$R/\sigma_P$ = 11.65/2.10                                     =  5.5
Theoretical maximum response/$\sigma_P$ = 29.70/2.10          = 14.1
$N_e$ (given)                                                 = 33
Duration (given)                                              = 34
Half-life (deduced)                                           = 13
Half-life/$N_e$ = 13/33                                       =  0.39
Observed/Maximum: Response = 11.65/29.70                      =  0.39
                  Half-life = 13/46                           =  0.28
No. of loci, [12.11]: $n = (17.6)^2/[8 \times (0.94)^2]$      = 44 (77)
$2a/\sigma_P = 2h\sqrt{2/n}$                                  =  0.19 (0.14)


68 (10.3) Since we are working with correlations and not covariances the
components will all be proportions of the total phenotypic variance $V_P$. The paternal
half sibs give an estimate of

$V_A/V_P = 4 \times 0.140 = 0.560$.

In view of the standard errors this is not inconsistent with the estimate from the
regression of children on fathers in Problem 10.2. The environmental variance common
to children of the same mother is estimated from the difference between the
maternal and paternal half-sib correlations:

$V_{Ec}/V_P = 0.257 - 0.140 = 0.117$.

Environmental resemblance between children of the same mother is to be expected
in consequence of the environmental resemblance between children and mothers found
in Problem 10.2. The two covariances are nearly the same, though this is not a
necessary part of the expectation. The full-sib correlation estimates
$(\tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec})/V_P$, so

$\tfrac{1}{4}V_D/V_P = 0.406 - 2(0.140) - 0.117 = 0.009$
$V_D/V_P = 0.036$

That this is not significantly different from zero can easily be seen without working
out its standard error. The remaining component, environmental variance within
sibships, is got by the difference from the total of 1. The full partitioning, in percentages,
is

$V_A + V_D + V_{Ec} + V_{Ew} = V_P$
 56  +   4   +    12    +    28    = 100

69 (14.1) The change of mean is given by [14.4] as $-2F\Sigma dpq$. After one
generation of selfing F = 0.5 (Table 5.1), so the inbreeding depression due to the
four loci is $\Sigma dpq$, where d is the difference between the heterozygote value and
the mid-homozygote value (Fig. 7.1). The mid-homozygote value is given below
as the difference from the AA value, and is one half of the 'aa' value in the table
of data.


Locus     Mid-hom. value       d        pq       dpq

(1)           -10               0      0.25       0
(2)           -15             +20      0.25       5
(3)           -15              -5      0.16      -0.8
(4)           -30             +30      0.09       2.7

                                  $\Sigma dpq$ =  6.9


There would be a reduction in yield of 6.9 g due to these loci.


70 (19.4) The realized heritabilities, by [11.7], are as follows.

$h_G^2 = 186/574 = 0.32$
$h_F^2 = 525/1312 = 0.40$

The genetic correlation, by [19.7], is

$r_A^2 = \dfrac{120}{186} \times \dfrac{412}{525} = 0.5063$

$r_A = \pm 0.71$

Because the direct and correlated responses are in the same direction the sign of
$r_A$ must be positive, so $r_A = +0.71$.


71 (1.8) Hardy-Weinberg frequencies are not expected because the gene
frequency was different in the male and female parents. The gene frequency of so
in males was 16/20 = 0.8, and in females 4/20 = 0.2. Putting these as the gametic
frequencies in Table 1.2 gives the genotype frequencies in the progeny as

so/so    so/cn    cn/cn
0.16     0.68     0.16

With a gene frequency of 0.5 the Hardy-Weinberg expectations are 0.25, 0.50,
0.25. Note the excess of heterozygotes resulting from the unequal gene frequencies
in male and female parents.


72 (2.7) Let q be the frequency of the mutant allele, so that q_0 = 1. Let m be
the proportion of immigrants (m = 0.01) with q_m = 0. Then [2.1] gives

q_1 = m q_m + (1 - m) q_0
    = (0.01 × 0) + (0.99 × 1) = 0.99
q_2 = (0.01 × 0) + (0.99 × 0.99) = (0.99)²
q_10 = (0.99)^10 = 0.9044

Frequency of wild-type allele = p = 1 - 0.9044 = 0.0956.

After the last immigrants have bred the genotypes will be in Hardy-Weinberg
proportions. Therefore

(1) Frequency of wild-type flies = p² = (0.0956)² = 0.0091

(2) Frequency of mutant phenotype = 1 - p² = 0.9909
Frequency of heterozygotes = 2pq = 0.1729
Frequency of heterozygotes among mutants = 0.1729/0.9909 = 0.1745
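
The recurrence used above can be iterated directly. The following is a minimal sketch; the function and variable names are illustrative, and the parameter values (m = 0.01, immigrants free of the mutant, q starting at 1) are those of the solution.

```python
def mutant_frequency_after(generations, m=0.01, q_immigrants=0.0, q_start=1.0):
    """Iterate q' = m*q_m + (1 - m)*q for the given number of generations."""
    q = q_start
    for _ in range(generations):
        q = m * q_immigrants + (1.0 - m) * q
    return q


q10 = mutant_frequency_after(10)
p10 = 1.0 - q10

# Hardy-Weinberg proportions after the last immigrants have bred.
wild_type = p10 ** 2                      # wild-type flies
heterozygotes = 2.0 * p10 * q10
het_among_mutants = heterozygotes / (1.0 - wild_type)

print(round(q10, 4), round(wild_type, 4), round(het_among_mutants, 4))
```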


73 (3.8) Start from the expected frequency of heterozygotes, given as

H = 2pq + ½D² = 2pq + ½σ²_D

where D is the difference of gene frequency between the male and female parents.
Let there be M male and F female parents, with 2M and 2F genes sampled. The
binomial sampling variances of the gene frequencies are pq/2M in male parents,
and pq/2F in female parents, where p and q are the overall gene frequencies in the
whole population. The sampling variance of the difference of gene frequency is

σ²_D = pq/2M + pq/2F

The modified equation is therefore

H = 2pq + pq(1/4M + 1/4F)
  = 2pq[1 + (1/8M + 1/8F)]


74 (5.8) Think of the part of the population made up only of cousin marriages.
This is a subdivided population with F = 1/16. (See Problem 5.1.) The risk to the
children is the frequency of homozygotes, given in Table 3.1 as q² + pqF. Call
this risk Q. To evaluate Q we need the gene frequency, q. Since most marriages
in the whole population are non-consanguineous we can take q to be approximately
the square root of the incidence. (There is a small error in doing this because, with
some cousin marriages, the genotypes are not quite in Hardy-Weinberg proportions;
see Problem 5.9.) Finally, the risk to the children of cousin marriages relative
to the population as a whole is R = Q/I, where I is the population incidence.

                        (1) Cystic fibrosis    (2) PKU
Incidence, I (= q²)     4 × 10^-4              90.91 × 10^-6
q = √I                  2 × 10^-2              9.53 × 10^-3
Fpq                     12.25 × 10^-4          590 × 10^-6
Q = q² + Fpq            16.25 × 10^-4          681 × 10^-6
R = Q/I                 4.1                    7.5

The risk is increased 4-fold for cystic fibrosis and 7½-fold for PKU, but in absolute
terms the risks are still small.
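
The two columns of the table can be reproduced with a few lines of code. This is a sketch under the same approximation as the solution (q taken as the square root of the incidence); the function name is an illustrative choice.

```python
from math import sqrt


def cousin_marriage_risk(incidence, F=1.0 / 16.0):
    """Return (Q, R): risk to children of cousin marriages and risk relative to I."""
    q = sqrt(incidence)          # approximation: q taken as the square root of I
    p = 1.0 - q
    Q = q * q + F * p * q        # homozygote frequency q^2 + pqF (Table 3.1)
    return Q, Q / incidence


cf_Q, cf_R = cousin_marriage_risk(4e-4)        # cystic fibrosis
pku_Q, pku_R = cousin_marriage_risk(90.91e-6)  # PKU

print(round(cf_R, 1), round(pku_R, 1))
```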

75 (15.1) Base population parameters:

V_P = 4.0; V_G = V_A = 0.52 × 4.0 = 2.08; V_E = 4.0 - 2.08 = 1.92

This is slow inbreeding so we use column (1) of Table 15.1. For F = 0.5 this gives

                 Genetic               Environmental    Phenotypic
Between lines    2FV_G = 2.08          0                2.08
Within lines     (1 - F)V_G = 1.04     1.92             2.96
Total            (1 + F)V_G = 3.12     1.92             5.04

Heritabilities: (1) within lines 1.04/2.96 = 0.35
                (2) overall 3.12/5.04 = 0.62

76 (7.8) The relevant values found for the two genes were

Gene b:   a = d = 2.5;  q = 0.5
Gene c^e: a = d = 28.5; q = 0.2

First get the average effect of each gene substitution from [7.5]:

Gene b:   α = 2.5 + 2.5(0.5 - 0.5) = 2.5
Gene c^e: α = 28.5 + 28.5(0.2 - 0.8) = 11.4

The genotype we are concerned with corresponds to A_2A_2 in Table 7.3. Substituting
the above values for each locus separately gives the following.

              Breeding value = -2pα           Dominance deviation = -2p²d
Gene b:       -2 × 0.5 × 2.5 = -2.50          -2 × (0.5)² × 2.5 = -1.25
Gene c^e:     -2 × 0.8 × 11.4 = -18.24        -2 × (0.8)² × 28.5 = -36.48
Both genes:   -20.74                          -37.73

Adding the breeding values of the separate loci gives the breeding value of the joint
genotype, similarly for the dominance deviation. The breeding value calculated is
the deviation from the population mean, which was found to be 91.47 granules. In
absolute units, therefore, the breeding value is 91.47 - 20.74 = 70.73 granules.
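
The per-locus steps can be sketched in code as a check on the arithmetic. The function name and the tuple it returns are illustrative; the formulas are those quoted in the solution (average effect a + d(q - p), breeding value -2p times the average effect, dominance deviation -2p²d for the A_2A_2 genotype).

```python
def a2a2_values(a, d, q):
    """Average effect, breeding value and dominance deviation of A2A2 at one locus."""
    p = 1.0 - q
    alpha = a + d * (q - p)           # average effect of the gene substitution
    breeding_value = -2.0 * p * alpha
    dominance_dev = -2.0 * p * p * d
    return alpha, breeding_value, dominance_dev


gene_b = a2a2_values(a=2.5, d=2.5, q=0.5)
gene_ce = a2a2_values(a=28.5, d=28.5, q=0.2)

total_bv = gene_b[1] + gene_ce[1]     # breeding value of the joint genotype
total_dom = gene_b[2] + gene_ce[2]    # dominance deviation of the joint genotype
absolute_bv = 91.47 + total_bv        # population mean of 91.47 granules

print(round(total_bv, 2), round(total_dom, 2), round(absolute_bv, 2))
```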

77 (13.1) The heritabilities are calculated by [13.4] and [13.5], given also
in Table 13.4. The relative responses are calculated from the expressions in the
right-hand column of Table 13.4.

        h²_f [13.4]    h²_w [13.5]    R_f/R    R_w/R
(1)     0.265          0.077          0.93     0.72
(2)     0.390          0.077          1.06     0.74
(3)     0.100          0.100          0.79     0.61
(4)     0.147          0.500          0.68     0.97
(5)     0.136          0.500          0.62     1.05

Note that, as can readily be seen from [13.4] and [13.5], when t = r, h²_f = h²_w =
h². In these circumstances individual selection gives the best weighting of the
individual and the family mean. Note also the circumstances that make family selection
or within-family selection better than individual selection.

78 (10.4) The expectations of the mean squares are

B = σ²_W + 2σ²_B
W = σ²_W

from which

σ²_B = (B - W)/2
σ²_W = W

The intraclass correlation is

t = σ²_B/(σ²_B + σ²_W)
  = ½(B - W)/[½(B - W) + W]
  = (B - W)/(B + W)

79 (14.2) The population mean, by [7.3], is M = Σ a(p - q) + 2Σ dpq. This
is a deviation from the multiple mid-homozygote value. The mean with the favourable
alleles all homozygous will be Σ a, also a deviation from the multiple mid-homozygote
value. Therefore the increase will be

Σ a - M
= Σ a - Σ a(p - q) - 2Σ dpq
= Σ[a(2q)] - 2Σ dpq
= 2(Σ aq - Σ dpq)

We need to know the value of a for each locus. This is half the difference between
the homozygote values. The value of Σ dpq was obtained in Problem 14.1 as 6.9.

Locus      a       aq
(1)       10       5
(2)       15       7.5
(3)       15       3
(4)       30       3
          Σ aq  =  18.5

Increase = 2(18.5 - 6.9) = 23.2 g.

80 (19.5) The solution comes from [19.9] with selection for body weight
giving the correlated response of litter size. The intensities of selection
corresponding to the proportions selected, from Appendix Table A, are

                   Selection for
                   Litter size (X)    Body weight (Y)
i on females       1.271              1.271
i on males         0                  1.755
i (mean)           0.6355             1.513

Substitution into [19.9] gives

CR_X/R_X = 0.43 × (1.513/0.6355) × √(0.35/0.22) = 1.29

Selection for body weight is expected to be 29 per cent more effective than selection
for litter size, mainly because males can be selected.
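
A short sketch of the substitution into [19.9] follows. The function name is illustrative; the numerical values (genetic correlation 0.43, heritabilities 0.22 and 0.35, and the intensities from the table) are those used in the solution.

```python
from math import sqrt


def relative_correlated_response(r_A, i_Y, i_X, h2_Y, h2_X):
    """CR_X / R_X = r_A * (i_Y / i_X) * (h_Y / h_X), as substituted above."""
    return r_A * (i_Y / i_X) * sqrt(h2_Y / h2_X)


# Mean intensities over the two sexes (no selection on males for litter size).
i_X = (1.271 + 0.0) / 2.0
i_Y = (1.271 + 1.755) / 2.0

ratio = relative_correlated_response(r_A=0.43, i_Y=i_Y, i_X=i_X, h2_Y=0.35, h2_X=0.22)
print(round(ratio, 2))
```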


81 (1.9) It can easily be shown to be true by ‘trial and error’ with a small 
number of alleles. (Calculate the frequency of heterozygotes with three alleles all 
at the same frequency; then recalculate with unequal frequencies.) A simple general 
proof comes from consideration of the variance of the allele frequencies. With equal 
frequencies the variance is zero. Let n be the number of alleles and q the frequency 
of any allele. Then Σ q = 1 and (Σ q)² = 1. The frequency of homozygotes is
Σ q², and when this is minimal the frequency of heterozygotes is maximal. The
variance of q is given by

σ²_q = (1/n)[Σ q² - (Σ q)²/n] = (1/n)[Σ q² - 1/n]

Rearrangement leads to 




Σ q² = nσ²_q + 1/n

Therefore, with any value of n, the frequency of homozygotes is minimal when
σ²_q = 0. All alleles then have equal frequencies of 1/n.

Substituting σ²_q = 0 into the above equation gives the frequency of homozygotes
as 1/n. Therefore the frequency of heterozygotes is 1 - (1/n).

Note that the first equation above can be written in the following useful form: 


σ²_q = (Σ q²)/n - (1/n)² = (mean of q²) - (mean of q)²


i.e. variance = (mean of squares) — (square of mean). This is used in later chapters. 


82 (2.8) Because both populations reached the same, intermediate, gene
frequency selection must have favoured heterozygotes. This is to be expected because
the ‘heterozygotes’, so +/+ cn, are wild-type and both homozygotes are mutant. The
relative magnitude of the selection coefficients against the two homozygotes can be
found by [2.18]:

s(so)/s(cn) = p(cn)/q(so) = 0.65/0.35 = 1.86


83 (4.1) The effective population size is given by [4.4b]. Then get ΔF by [4.1]
and the inbreeding coefficient by [3.12].

No. of females     10       10        10       10
No. of males       10       5         2        1
N_e                20       13.33     6.67     3.636
ΔF                 0.025    0.0375    0.075    0.1375
F (t = 10)         0.224    0.318     0.541    0.772
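
The table can be regenerated with a small sketch. The three formulas are written out here as assumptions about what [4.4b], [4.1] and [3.12] contain (N_e = 4N_mN_f/(N_m + N_f), ΔF = 1/2N_e, F_t = 1 - (1 - ΔF)^t); they reproduce the tabulated values. Function names are illustrative.

```python
def effective_size(n_males, n_females):
    """N_e with unequal numbers of the two sexes (assumed form of [4.4b])."""
    return 4.0 * n_males * n_females / (n_males + n_females)


def rate_of_inbreeding(n_e):
    """Delta F = 1 / (2 N_e) (assumed form of [4.1])."""
    return 1.0 / (2.0 * n_e)


def inbreeding_after(delta_f, t):
    """F_t = 1 - (1 - Delta F)^t (assumed form of [3.12])."""
    return 1.0 - (1.0 - delta_f) ** t


rows = []
for males in (10, 5, 2, 1):
    n_e = effective_size(males, 10)
    d_f = rate_of_inbreeding(n_e)
    rows.append((males, n_e, d_f, inbreeding_after(d_f, 10)))

for males, n_e, d_f, f10 in rows:
    print(males, round(n_e, 3), round(d_f, 4), round(f10, 3))
```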


84 (5.9) The incidence in the non-inbred individuals is q², and in the inbred
individuals is q²(1 - F) + qF, from the right-hand side of Table 3.1. The overall
incidence is therefore

I = (1 - y)q² + y[q²(1 - F) + qF]

By multiplying out the brackets this reduces easily to

(1 - yF)q² + yFq = I

which can be solved for q if y and F are known.

Note that yF is the average inbreeding coefficient in the population as a whole,
and the expression for the overall incidence can be got immediately from Table 3.1
by putting yF in place of F.

The equation can be used to get the exact solution to Problem 5.8, assuming there
are no other causes of departure from Hardy-Weinberg proportions. The frequency
of cousin marriages varies a lot, but is about 1 per cent in many populations.

85 (15.2) This is rapid inbreeding so we should use column (2) of Table 15.1,
with F = 0.5, f = 0.594 (from Table 5.1). This gives the components on the left
below. We need, however, the variance of observed means of lines estimated from
four individuals, which is σ²_B + ¼σ²_W (Table 13.3). This gives the values on the
right below.



                   Components                 Variance of observed means
                   Genetic    Phenotypic      Genetic    Phenotypic
Between, σ²_B      2.47       2.47            2.63       3.11
Within, σ²_W       0.65       2.57            -          -

The heritability of observed line-means is 2.63/3.11 = 0.85. The intensity of
selection for p = 5 per cent is i = 2.063 from Appendix Table A. The expected response
is given by [13.2], the lines being equivalent to families. It is

R = 2.063 × 0.85 × √(3.11) = 3.1 bristles.

(If worked as if for slow inbreeding, R = 2.9 bristles: not very different.)
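
The numbers in this solution can be retraced with the sketch below. Two points are assumptions rather than quotations: the between-line component is taken as twice 0.594 times V_A, and the within-line genetic component as the remainder of the total (1 + F)V_A. These forms are inferred from the tabulated values, not copied from Table 15.1. V_A = 2.08 and V_E = 1.92 are the base-population values of Problem 15.1.

```python
from math import sqrt

V_A, V_E = 2.08, 1.92      # base-population values from Problem 15.1
F, f = 0.5, 0.594          # the two coefficients quoted from Table 5.1
n = 4                      # individuals measured per line
i = 2.063                  # intensity of selection for 5 per cent selected

# Components (assumed forms, chosen because they reproduce 2.47 and 0.65).
between = 2.0 * f * V_A
within_genetic = (1.0 + F) * V_A - between
within_phenotypic = within_genetic + V_E

# Variance of observed line means: between + within / n.
var_means_genetic = between + within_genetic / n
var_means_phenotypic = between + within_phenotypic / n

h2_means = var_means_genetic / var_means_phenotypic
response = i * h2_means * sqrt(var_means_phenotypic)

print(round(between, 2), round(within_genetic, 2),
      round(var_means_genetic, 2), round(var_means_phenotypic, 2),
      round(h2_means, 2), round(response, 1))
```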

86 (7.9) To get the population mean we need the frequencies of the four
genotypes as shown in table (i). Multiply the value of each genotype in Example
7.7 by its frequency and add to give the population mean = 1.112. Convert the
values to deviations from the mean as in table (ii). These deviations from the mean
will now be referred to simply as values. Next, we have to look at each locus
separately and find the mean value of each of its two genotypes in this population.
These are given in the margins of table (ii); for example, the mean value of the
C- genotype is 0.6(0.328) + 0.4(-0.342) = +0.060. Now get the additive
expectations of the combined genotypes as in table (iii). These are the values the
combined genotypes would have if the values of the two single-locus genotypes were
simply added together. For example, the expectation for the B- C- genotype is
0.228 (for B-) + 0.060 (for C-) = 0.288. Finally, the interaction deviation of
each genotype is the difference between the observed value in table (ii) and the additive
expectation in table (iii). These are given in table (iv). For example, the interaction
deviation of B- C- is 0.328 - 0.288 = +0.04. To check, see that the mean
interaction deviation is zero.


(i) Frequencies

                        B-        bb
              Freq.     0.6       0.4
C-            0.8       0.48      0.32
c^e c^e       0.2       0.12      0.08

(ii) Observed deviations from mean

                        B-        bb        Mean
              Freq.     0.6       0.4
C-            0.8       0.328    -0.342    +0.060
c^e c^e       0.2      -0.172    -0.342    -0.240
Mean                   +0.228    -0.342     0.000

(iii) Additive expectations

              B-        bb
C-           +0.288    -0.282
c^e c^e      -0.012    -0.582

(iv) Interaction deviations

              B-        bb
C-           +0.04     -0.06
c^e c^e      -0.16     +0.24

The interaction deviations can also be calculated directly from the table of genotypic
values as follows. The means in the margins are obtained as in table (ii) above.



              B-       bb       Mean
C-            1.44     0.77     1.172
c^e c^e       0.94     0.77     0.872
Mean          1.34     0.77     1.112

Then, taking the B- C- genotype as an example, the steps followed above in
calculating the interaction deviation can be summarized as

(1.44 - M) - [(1.34 - M) + (1.172 - M)]

where M is the population mean = 1.112. The interaction deviation of this genotype
then becomes

1.44 - 1.34 - 1.172 + 1.112 = 0.04
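
The direct route to the interaction deviations can be sketched as follows. The dictionary keys are illustrative labels ("cc" stands for the c^e c^e genotype); the genotypic values and frequencies are those of the tables above.

```python
freq_B = {"B-": 0.6, "bb": 0.4}
freq_C = {"C-": 0.8, "cc": 0.2}          # "cc" stands for the c^e c^e genotype

value = {("B-", "C-"): 1.44, ("bb", "C-"): 0.77,
         ("B-", "cc"): 0.94, ("bb", "cc"): 0.77}

# Population mean and the marginal means for each single-locus genotype.
mean = sum(freq_B[b] * freq_C[c] * v for (b, c), v in value.items())
mean_B = {b: sum(freq_C[c] * value[(b, c)] for c in freq_C) for b in freq_B}
mean_C = {c: sum(freq_B[b] * value[(b, c)] for b in freq_B) for c in freq_C}

# Interaction deviation = value - row mean - column mean + population mean.
interaction = {(b, c): value[(b, c)] - mean_B[b] - mean_C[c] + mean
               for (b, c) in value}

# Check: the frequency-weighted mean interaction deviation should be zero.
mean_interaction = sum(freq_B[b] * freq_C[c] * dev
                       for (b, c), dev in interaction.items())

for key, dev in interaction.items():
    print(key, round(dev, 2))
```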


87 (13.2) The responses relative to individual selection, R, are got from the
expressions at the right-hand side of Table 13.4. (1) and (2) are family selection;
(3) is sib selection; (4) is within-family selection.

(1) R_f/R = [1 + (4 × 0.25)] / √{5[1 + (4 × 0.10)]} = 0.76

(2) R_f/R = [1 + (4 × 0.5)] / √{5[1 + (4 × 0.36)]} = 0.86

(3) R_s/R = (5 × 0.5) / √{5[1 + (4 × 0.36)]} = 0.72

(4) R_w/R = (1 - 0.5) √{4 / [5(1 - 0.36)]} = 0.56
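
The four relative responses can be checked with the sketch below. The general forms (written in terms of family size n, relationship r and phenotypic correlation t) are inferred from the numerical substitutions in this solution and are assumed to match the expressions of Table 13.4; the function names are illustrative.

```python
from math import sqrt


def family_selection(n, r, t):
    """R_f / R = [1 + (n - 1) r] / sqrt(n [1 + (n - 1) t])."""
    return (1.0 + (n - 1) * r) / sqrt(n * (1.0 + (n - 1) * t))


def sib_selection(n, r, t):
    """R_s / R = n r / sqrt(n [1 + (n - 1) t])."""
    return n * r / sqrt(n * (1.0 + (n - 1) * t))


def within_family_selection(n, r, t):
    """R_w / R = (1 - r) sqrt((n - 1) / (n (1 - t)))."""
    return (1.0 - r) * sqrt((n - 1) / (n * (1.0 - t)))


case1 = family_selection(5, 0.25, 0.10)
case2 = family_selection(5, 0.5, 0.36)
case3 = sib_selection(5, 0.5, 0.36)
case4 = within_family_selection(5, 0.5, 0.36)

print(round(case1, 2), round(case2, 2), round(case3, 2), round(case4, 2))
```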


88 (10.5) This illustrates some of the difficulties in interpreting twin data. The
heritability can be estimated in three ways with different biases, as shown in the
table below. All are biased, but in different amounts, by dominance and by epistatic
components not shown in the table. (2) and (3) are biased by common environment.
The children under 10 make reasonably good sense. The degree of genetic
determination is estimated approximately by (1) as 52 per cent. There is resemblance
due to common environment, which can be estimated approximately by subtracting
(1) from (2), or equivalently by subtracting (2) from (3), as V_Ec = 12 per cent. The
children aged 10-15 do not make good sense. It appears that V_Ec does not
contribute to their resemblance because (2) and (3) are less than (1).




                                                 Under 10    10-15
(1) 2(MZ - DZ)  = (V_A + 1½V_D)/V_P              0.52        [illegible]
(2) MZ          = (V_A + V_D + V_Ec)/V_P         0.64        [illegible]
(3) 2DZ         = (V_A + ½V_D + 2V_Ec)/V_P       0.76        [illegible]

89 (14.3) The experiment gave two values, 8.1 and 8.5, for the mean with
F = 0. The solution will be based on the mean of these, i.e. 8.3. Let D_M and D_L
be the depression due to inbreeding in the mothers and the litters respectively when
F = 1. Then from the third line of the table in Example 14.2

0.5 D_M = 8.3 - 6.2
    D_M = 4.2

From the second line of the table

0.5 D_L + 0.375 D_M = 8.3 - 5.7
                D_L = 2.05

Inbreeding in the mothers caused about twice as much depression as inbreeding in
the litters. The total depression at F = 1 would be

D_M + D_L = 4.2 + 2.05 = 6.25

and the mean litter size would be 8.3 - 6.25 = 2.05 young per litter.

90 (19.6) First work out the expected response to selection of females only,
which was not calculated in the solution to Problem 19.5. We can assume that 25
per cent are selected out of a large number, so the intensity of selection is taken
from Appendix Table A. By [11.3] the response is

R = ½ × 1.271 × 0.22 × σ_P = 0.140 σ_P

For the joint selection we calculate the response expected from the direct
selection for litter size in females by [11.3] and the correlated response from selection
for body weight in males by [19.6], and then add the two expected responses together.
This is equivalent to calculating the mean breeding value for litter size of the selected
females and males. The units throughout are phenotypic standard deviations of
litter size. The intensity of selection on females has to be taken from Appendix Table
B because all are selected from small samples, i.e. 1 out of 4. The predicted responses
are

From females:   R = ½ × 1.029 × 0.22 = 0.113 σ_P
From males:     CR = ½ × 1.271 × √(0.35 × 0.22) × 0.43 = 0.076 σ_P
From both:      Joint response = 0.189 σ_P

The joint response relative to the response expected from selecting females only is

0.189/0.140 = 1.35

The third procedure would be 35 per cent better than the first.
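
The three responses can be recomputed as a check. This is a minimal sketch using the values quoted in the solution; all responses are in phenotypic standard deviations of litter size, and the variable names are illustrative.

```python
from math import sqrt

h2_X, h2_Y, r_A = 0.22, 0.35, 0.43   # litter size (X), body weight (Y)

# Selection of females only, intensity from a large sample.
females_only = 0.5 * 1.271 * h2_X

# Joint selection: direct response from females (small-sample intensity)
# plus the correlated response from selecting males on body weight.
from_females = 0.5 * 1.029 * h2_X
from_males = 0.5 * 1.271 * sqrt(h2_X * h2_Y) * r_A
joint = from_females + from_males

relative = joint / females_only
print(round(females_only, 3), round(from_females, 3),
      round(from_males, 3), round(joint, 3), round(relative, 2))
```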

91 (1.10) The working comes from Table 1.5 and equation [1.5]. AA bb is
produced by union of two recombinant gametes of type Ab, whose frequency, s,
we therefore have to find. (Ab corresponds to A_1B_2 in Table 1.5.) We need to know
the following quantities. (i) The gene frequencies; these are 0.5 at both loci. (ii)
The equilibrium frequency, ŝ, of Ab gametes; this is ŝ = p_A q_b = 0.25. (iii) The
disequilibrium measure, D_0, in generation 0, before any recombination has taken
place, and (iv) the disequilibrium, D_2, in generation 2. The disequilibrium in any
generation, calculated from the Ab gamete frequencies, is -D = s - ŝ. In
generation 0, s = 0 so D_0 = ŝ = 0.25. D_2 is got from [1.5]; the generation 2 progeny
are the product of 2 generations of recombination, so t = 2 and D_2 = D_0(1 - c)².
With free recombination in question (1), c = 0.5 and D_2 = 0.25(0.5)² = 0.0625.
With c = 0.2 in question (2), D_2 = 0.25(0.8)² = 0.16. Next we return to the
equation -D = s - ŝ given above. Writing this as s = ŝ - D_2 and substituting the
values obtained for ŝ and D_2 we get for generation 2,

(1) s = 0.25 - 0.0625 = 0.1875, and (2) s = 0.25 - 0.16 = 0.09.

Finally, the frequency of AA bb in the progeny produced by these gametes is s².
The answers are therefore (1) 0.0352 and (2) 0.0081.
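
The decay of the disequilibrium and the resulting genotype frequency can be sketched in a few lines. The function name and argument names are illustrative; the starting disequilibrium and the equilibrium gamete frequency are both 0.25, as in the solution.

```python
def aabb_frequency(c, d0=0.25, s_equilibrium=0.25, generations=2):
    """Frequency of AA bb after the given number of generations of recombination."""
    d_t = d0 * (1.0 - c) ** generations   # disequilibrium decays by (1 - c) per generation
    s = s_equilibrium - d_t               # frequency of Ab gametes
    return s * s                          # union of two Ab gametes


free = aabb_frequency(c=0.5)      # question (1), free recombination
linked = aabb_frequency(c=0.2)    # question (2)
print(round(free, 4), round(linked, 4))
```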

92 (2.9) Let R be the resistance gene with frequency p, and S the susceptible
allele; let s_1 be the selection coefficient against RR, given as s_1 = 0.63. To find
the proportion of rats that die as a result of the poisoning we have to find s_2, the
selection coefficient against SS. By [2.18]

s_2/s_1 = p/(1 - p)

s_2 = 0.63 × 0.34/0.66 = 0.32

The proportion of deaths is

   RR               SS
   s_1 p²         + s_2 (1 - p)²
 = 0.63(0.34)²    + 0.32(0.66)²
 = 0.07           + 0.14
 = 0.21

21 per cent of all rats die. This can be got more directly by [2.21] which gives the
total deaths, i.e. the load, as

L = 0.63 × 0.34 = 0.21


93 (4.2) Substitute N_m = N_f/d into [4.4a] to give

1/N_e = d/(4N_f) + 1/(4N_f)
      = (d + 1)/(4N_f)

N_e = 4N_f/(d + 1)

94 (18.1) The required values of x and i are got from Appendix Table A,
with interpolation. The correlations, t, are calculated from the approximate formula
[18.1]. The values of r come from Table 9.3. The heritability is estimated by [18.2].
Taking first-degree relatives as an example,

t = (3.090 - 2.061)/3.367 = 0.306
h² = 2t = 0.61


                 p%       x        i        t        r      h²%
Population       0.1      3.090    3.367    -        -      -
Relatives
  1st degree     1.97     2.061    -        0.306    1/2    61
  2nd degree     0.34     2.706    -        0.114    1/4    46
  3rd degree     0.19     2.895    -        0.058    1/8    46


The estimate from first-degree relatives is likely to be the most precise, i.e. to 
have the smallest standard error, because the number of affected relatives is greatest 
and because the standard error of t is multiplied by 2 rather than by 4 or 8. On 
the other hand, first-degree relatives may have some environmental correlation 
through intra-uterine maternal effects. The most reliable estimate is probably that 
from second-degree relatives. Approximate standard errors can be calculated from 
the expression for the sampling variance of t given in the paragraph below equation 
[18.2], the standard error being the square root of the sampling variance.
Multiplying the standard error of t by 1/r gives the standard errors of the three estimates
of h² as 4, 10, and 26 per cent respectively. The estimates from first- and second-
degree relatives are not significantly different from each other, and the estimate from 
third-degree relatives is not significantly different from zero. 
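
The three heritability estimates follow from the same two steps, which can be sketched as below. The function name is illustrative; the step t = (x_P - x_R)/i is the one shown in the worked example for first-degree relatives, and the heritability is then t/r.

```python
def liability_heritability(x_population, i_population, x_relatives, r):
    """Return (t, h2): correlation from the approximate formula, then h2 = t / r."""
    t = (x_population - x_relatives) / i_population
    return t, t / r


x_P, i_P = 3.090, 3.367   # population values of x and i

first = liability_heritability(x_P, i_P, 2.061, r=1.0 / 2.0)
second = liability_heritability(x_P, i_P, 2.706, r=1.0 / 4.0)
third = liability_heritability(x_P, i_P, 2.895, r=1.0 / 8.0)

for t, h2 in (first, second, third):
    print(round(t, 3), round(100.0 * h2))
```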


95 (15.3) The first generation were full sibs, making F = 0.25. Assume that
there was no further inbreeding and that non-additive genetic variance is negligible.
Then rearranging [15.1] leads to

h²_0 = h²_t / [1 - F(1 - h²_t)]

Substituting h²_t = 0.34 gives the heritability in the base population as

h²_0 = 0.34 / [1 - (0.25 × 0.66)] = 0.41


96 (16.1) The predicted yield of a three-way cross is the mean of two single
crosses. We therefore have to look for the best two single crosses in which three
varieties are involved. The single-cross yields, in order of merit, are

Cross    AE      AC      BC      AB      DE      BE
Yield    31.8    22.8    16.5    14.1    13.1    12.4

The best two involve three varieties and so are suitable. The variety appearing in
both would have to be used for the second cross, so the cross would be (C × E)
× A, and its predicted yield is ½(AC + AE) = 27.3.

The predicted yield of a four-way cross is the mean of four single crosses. We
therefore have to look for the best four single crosses with four varieties each involved
in two of the crosses. These are AE, AC, BC and BE. The cross would be made
as (A × B) × (C × E), and its predicted yield is ¼(AC + AE + BC + BE)
= 20.9.


97 (13.3) The index required is for the individual and mean of sibs. The
coefficients of the additive and the phenotypic components of the variance of observed
family means in Table 13.3 are denoted by k and K respectively.

k = [1 + (4 × 0.5)]/5 = 0.600
K = [1 + (4 × 0.36)]/5 = 0.488

The weighting factors in the index are

b_1 = h²(0.400)/0.512;   b_2 = h²(0.112)/0.250

and the index is

I = h²(0.781)P_1 + h²(0.448)P_2

where P_1 is the individual's gain and P_2 is the family mean, both being deviations
from the population mean.

For convenience in application the index can be rescaled as

I' = P_1 + 0.574P_2

The rescaled index can also be calculated from [13.8].

Note that if the units of the index are to be units of weight gain, then P_2 in the
rescaled index, and both P_1 and P_2 in the unscaled index, must be deviations from
the population mean.
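
The weighting factors can be recomputed for any family size with the sketch below. The general expressions for the two weights, (1 - k)/(1 - K) and (k - K)/[K(1 - K)], each times h², are inferred from the numbers substituted in this solution (0.400/0.512 and 0.112/0.250) and are an assumption, not a quotation of the text's formulae. The function name is illustrative.

```python
def index_weights(n, r, t):
    """Return (k, K, b1/h2, b2/h2, rescaled weight b2/b1) for families of size n."""
    k = (1.0 + (n - 1) * r) / n      # coefficient of the additive component
    K = (1.0 + (n - 1) * t) / n      # coefficient of the phenotypic component
    b1 = (1.0 - k) / (1.0 - K)       # weight on the individual, per unit h2 (assumed form)
    b2 = (k - K) / (K * (1.0 - K))   # weight on the family mean, per unit h2 (assumed form)
    return k, K, b1, b2, b2 / b1


k5, K5, b1_5, b2_5, w5 = index_weights(n=5, r=0.5, t=0.36)
print(round(k5, 3), round(K5, 3), round(b1_5, 3), round(b2_5, 3), round(w5, 3))

# Usage: the same function gives the rescaled weight for larger families.
w8 = index_weights(n=8, r=0.5, t=0.36)[4]
print(round(w8, 3))
```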


98 (10.6) We first have to calculate the observational components as shown
in Table 10.3. Putting d = 3, k = 10, the mean squares of males are

3.894 = σ²_W + 10σ²_D + 30σ²_S
2.198 = σ²_W + 10σ²_D
1.125 = σ²_W


From these equations, and the corresponding ones for females, the observational
components and the correlations are as follows.

Components                          Males       Females
Between sires, σ²_S                 0.05653     0.0800
Between dams, σ²_D                  0.1073      0.1168
Within dams, σ²_W                   1.125       0.893
Total, σ²_T                         1.28883     1.0898

Correlations
Half-sib, σ²_S/σ²_T                 0.0439      0.0734
Full-sib, (σ²_S + σ²_D)/σ²_T        0.1271      0.1806

The relationship between observational and causal components is given in Table
10.4. The estimates of the causal components are as follows.

                                    Bristle units           Per cent of total
                                    Males      Females      Males     Females
V_A           = 4σ²_S               0.2261     0.3200       17.5      29.4
¼V_D + V_Ec   = σ²_D - σ²_S         0.0508     0.0368       3.9       3.4
¾V_D + V_Ew   = σ²_W - 2σ²_S        1.0119     0.7330       78.5      67.3
V_P           = σ²_T                1.2888     1.0898       99.9      100.1

Without maternal half sibs V_Ec cannot be separated from ¼V_D, nor V_Ew from ¾V_D.
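
The male column of both tables can be reproduced from the three mean squares. The sketch below assumes the balanced design stated in the solution (d = 3 dams per sire, k = 10 offspring per dam); the function name and the dictionary keys are illustrative.

```python
def sib_analysis(ms_sires, ms_dams, ms_within, d=3, k=10):
    """Observational components, sib correlations and causal components."""
    s2_w = ms_within
    s2_d = (ms_dams - ms_within) / k
    s2_s = (ms_sires - ms_dams) / (d * k)
    s2_t = s2_s + s2_d + s2_w
    return {
        "sires": s2_s, "dams": s2_d, "within": s2_w, "total": s2_t,
        "t_half_sib": s2_s / s2_t,
        "t_full_sib": (s2_s + s2_d) / s2_t,
        "V_A": 4.0 * s2_s,                 # additive variance
        "quarterVD_plus_VEc": s2_d - s2_s,
        "threeqVD_plus_VEw": s2_w - 2.0 * s2_s,
    }


males = sib_analysis(3.894, 2.198, 1.125)
for name, val in males.items():
    print(name, round(val, 4))
```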

99 (14.4) The predicted depression is 0.56 D_M + 0.64 D_L. Taking the values of
D_M and D_L calculated in Problem 14.3 for the non-inbred mean of 8.3, this gives
the depression as

(0.56 × 4.2) + (0.64 × 2.05) = 3.66

The means in Fig. 14.2(a), read from the graph, are approximately 7.6 at F = 0
and 4.2 at the last generation, giving a depression of 3.4 young per litter. This agrees
very well with the prediction.

100 (19.7) Let subscripts 1 denote growth (G) and 2 denote food consumption
(F). The index equations from [19.11] are

b_1P_11 + b_2P_12 = A_11
b_1P_21 + b_2P_22 = A_21

Substitute the given parameter estimates into [19.12]:

P_11 = (1.11)² = 1.2321
P_22 = (2.48)² = 6.1504
P_12 = P_21 = 0.83 × 1.11 × 2.48 = 2.2848
A_11 = 0.52 × (1.11)² = 0.6407
A_21 = 0.71 × √(0.52 × 0.40) × 1.11 × 2.48 = 0.8914

Substitute these values into the index equations:

                  1.2321 b_1 + 2.2848 b_2 = 0.6407     (1)
                  2.2848 b_1 + 6.1504 b_2 = 0.8914     (2)
(1) × 6.1504:     7.5779 b_1 + 14.05 b_2 = 3.9406      (3)
(2) × 2.2848:     5.2203 b_1 + 14.05 b_2 = 2.0367      (4)
(3) - (4):        2.3576 b_1 = 1.9039
                  b_1 = 0.8076
Substitute b_1 into (1):
                  b_2 = -0.1551

The index is

I = 0.808G - 0.155F

Or, rescaled to give unit weight to G, in the form
I' = G + WF
where W = b_2/b_1, it is
I' = G - 0.192F
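
The two index equations can also be solved by Cramer's rule, which avoids the elimination steps. This is a sketch using the parameter estimates quoted in the solution; the variable names are illustrative, and full floating-point precision is kept throughout, so the last digit may differ slightly from the hand calculation.

```python
from math import sqrt

sd_G, sd_F = 1.11, 2.48        # phenotypic standard deviations
h2_G, h2_F = 0.52, 0.40        # heritabilities
r_P, r_A = 0.83, 0.71          # phenotypic and genetic correlations

P11 = sd_G ** 2
P22 = sd_F ** 2
P12 = r_P * sd_G * sd_F
A11 = h2_G * P11
A21 = r_A * sqrt(h2_G * h2_F) * sd_G * sd_F

# Solve  b1*P11 + b2*P12 = A11  and  b1*P12 + b2*P22 = A21  by Cramer's rule.
det = P11 * P22 - P12 * P12
b1 = (A11 * P22 - P12 * A21) / det
b2 = (P11 * A21 - P12 * A11) / det
W = b2 / b1                    # weight on F when G is given unit weight

print(round(b1, 4), round(b2, 4), round(W, 3))
```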

101 (1.11) The single-locus genotypes are now not in Hardy-Weinberg
proportions in the generation 0 progeny, because the gene frequencies are different in the
male and female parents. This affects the disequilibrium in generation 1 which must
therefore be worked out first. Recombinant gametes are produced only by the AB/ab
genotype. Here all the generation 0 progeny are of this genotype, so s_1 (the
frequency of Ab gametes) is ½c. (In Problem 1.10 only half of the generation 0
progeny were AB/ab, so s_1 was half as great.) The rest of the calculation is as follows.

                                 (1) c = 0.5    (2) c = 0.2
s_1 = ½c                         0.25           0.10
equilibrium, ŝ                   0.25           0.25
D_1 = ŝ - s_1                    0              0.15
D_2 = D_1(1 - c), by [1.5]       0              0.12
s_2 = ŝ - D_2                    0.25           0.13
s_2² (freq. of AA bb)            0.0625         0.0169

There is less disequilibrium following the ‘cross’ than after the ‘mixture’ of the strains
in Problem 1.10. With no linkage the two-locus equilibrium frequencies are attained
in generation 1, which corresponds with the F_2 of a classical two-factor cross with
genotype frequencies of 1/16.

102 (2.10) Genotype frequencies among the parents (both genes) and gene
frequency, q_0, of mutant:

AA      Aa      aa      q_0
0.6     0       0.4     0.4

Gene (a): gamete frequencies:

                    A        a            Total
                    0.6      0.4 × 0.5    0.8
Divided by 0.8:     0.75     0.25         1.0

Random union among these gametes gives the genotype frequencies in the progeny:

AA        Aa        aa
0.5625    0.3750    0.0625

The observed gene frequency in the progeny, by [1.1], is q_1 = 0.25. This is the
same as in the gametes. Hardy-Weinberg expectations based on q_1 are therefore
exactly as observed. The progeny alone give no evidence of selection.

Gene (b): gamete frequency of mutant is q_0 = 0.4. With random mating the
genotype frequencies in the progeny are

                BB       Bb       bb       Total
In zygotes      0.36     0.48     0.16     1.00
In survivors    0.36     0.48     0.08     0.92
Observed        0.391    0.522    0.087    1.000

Observed gene frequency by [1.1]: q_1 = 0.348, which gives Hardy-Weinberg
expectations

                0.425    0.454    0.121

The observed frequencies have an excess of heterozygotes and a deficiency of both
homozygotes, suggesting to the unwary that selection favoured heterozygotes. The
progeny alone tell us only that the conditions for Hardy-Weinberg expectations have
not all been met. Δq is greater for gene (a) because there were no heterozygotes
among the parents and all mutant genes were exposed to selection in homozygotes.
With gene (b) many mutant genes were sheltered from selection in heterozygotes.
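
Both cases can be followed numerically with the sketch below. It assumes, as the tables of this solution imply, that a gametes of gene (a) have half the success of A gametes and that bb zygotes of gene (b) have half the survival of the other genotypes; the variable names are illustrative.

```python
# Gene (a): selection among gametes (a gametes half as successful), parents AA 0.6, aa 0.4.
gametes_A = 0.6
gametes_a = 0.4 * 0.5
q_a = gametes_a / (gametes_A + gametes_a)          # mutant frequency in gametes
p_a = 1.0 - q_a
progeny_a = (p_a ** 2, 2.0 * p_a * q_a, q_a ** 2)  # AA, Aa, aa after random union

# Gene (b): random mating at q0 = 0.4, then half of the bb zygotes are lost.
q0 = 0.4
zygotes_b = ((1 - q0) ** 2, 2 * (1 - q0) * q0, q0 ** 2)
survivors_b = (zygotes_b[0], zygotes_b[1], zygotes_b[2] * 0.5)
total_b = sum(survivors_b)
observed_b = tuple(x / total_b for x in survivors_b)

q1_b = observed_b[2] + 0.5 * observed_b[1]         # gene frequency among survivors
hw_b = ((1 - q1_b) ** 2, 2 * (1 - q1_b) * q1_b, q1_b ** 2)

print([round(x, 4) for x in progeny_a])
print([round(x, 3) for x in observed_b], round(q1_b, 3))
print([round(x, 3) for x in hw_b])
```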

103 (4.3) Remember that 500 breeding pairs means N = 1000.

N_e = 253 by [4.6];

ΔF = 0.20 per cent by [4.1].


104 (18.2) The required values of x and i from Appendix Table A are given below.
The correlations, t, are calculated by [18.1]. The correlations are multiplied by 2
to give the heritability because r = ½. The two estimates agree very well. The
‘repeat births’ are treated in the same way as relatives to give the correlation which
is the repeatability. It is calculated as (1.812 - 1.282)/2.208 = 0.24.



                 p%       x        i        t         h²%
Population       3.5      1.812    2.208    -         -
Mothers          4.6      1.685    -        0.0575    12
Daughters        4.8      1.665    -        0.0666    13
Repeat births    10.0     1.282    -        0.24      -

105 (15.4) One-quarter of the mutations occurring will be fixed in a sib-mated
line, in which there are 4 gametes per generation. So the number of mutations found
is the mutation rate per gamete. The number of generations by which the sublines
were separated is 21 + 29 = 50, and there were 27 characters. Therefore the total
mutation rate per gamete per generation per character is

5/(50 × 27) = 3.7 × 10^-3

This is the total mutation rate. If the mutation rate per locus is u and if each character
can be affected by mutations at any of n loci, then

3.7 × 10^-3 = nu

We can take our choice as to what is the most credible combination of values for
u and n, e.g.

u                n
3.7 × 10^-3      1
3.7 × 10^-4      10
3.7 × 10^-5      100


106 (16.2) Let A, B, and AB represent the performances of the two lines or breeds
and the single cross or F_1.

Generation    Genotypes crossed         Expected performance of progeny
1             AA × BB                   AB
2             AB × AA                   (AB + A)/2
3             (½AB + ½AA) × BB          (3AB + B)/4
4             (¾AB + ¼BB) × AA          (5AB + 3A)/8

107 (13.4) We use the rescaled index in which P_1 is the individual weight gain
and P_2 is the family mean, which must be expressed as a deviation from the
population mean; b_2 is the weighting factor for P_2. Problem 13.3 gave b_2 = 0.574 for
families of n = 5. For n = 8, k = 0.5625 and K = 0.4400, giving b_2 = 0.636.
The index values of the individuals are

A: 1.6 + (0.574 × -0.2) = 1.485
B: 1.5 + (0.574 × 0.1) = 1.557
C: 1.5 + (0.636 × 0.1) = 1.564
D: 1.3 + (0.636 × 0.2) = 1.427

The order of merit is C, B, A, D.

(Note: P_2 has to be a deviation from the population mean because its weighting
(b_2) is not the same in all individuals. P_1 does not have to be a deviation because
its weighting is the same. It is sensible to have an index whose mean is equal to
the population mean or to zero. If all measurements in the index are expressed as
deviations, then the index itself is a deviation and its mean is zero. If one
measurement is in actual units, then all others must be deviations and the index then has
a mean equal to the population mean.)

108 (10.7) Consider first the particular case of t = 0.75, h² = 0.5. Let V_P =
100. Then V_A = 50 and cov_FS = 75.

cov_FS = ½V_A + ¼V_D + V_Ec = 75

Therefore

¼V_D + V_Ec = 75 - 25 = 50

Now, the phenotypic variance can be written as

V_P = V_A + (¼V_D + V_Ec) + ¾V_D + V_Ew = 100

Therefore

¾V_D + V_Ew = 100 - 50 - 50 = 0

So h² cannot be greater than 0.5 because neither V_D nor V_Ew can be less than zero.

To generalize, consider the variance within full-sib families (see Table 10.4), which
is

½V_A + ¾V_D + V_Ew = V_P - cov_FS = V_P(1 - t)

½V_A = V_P(1 - t) - ¾V_D - V_Ew

V_A is therefore maximal when V_D and V_Ew are zero. The value of h² at its maximum
is thus

½h² = 1 - t
h² = 2(1 - t)

If the full-sib correlation is 0.8, the maximum possible heritability is

h² = 2 × 0.2 = 0.4

A character with V_Ew = 0 is extremely improbable, especially if V_Ec is not zero,
so in practice the upper limit of h² will be substantially lower than 2(1 - t).

109 (14.5) We shall calculate the average selection differential on females
required per generation. With $N_e = 40$ the rate of inbreeding, by [4.1], was
$\Delta F = 0.0125$. The inbreeding coefficient at generation 30, by [3.12], was $F = 0.3143$.
With slow inbreeding like this it is a good enough approximation to take the
inbreeding coefficient of mothers and litters as being the same. The depression from
mothers and litters together will be taken from the solution to Problem 14.3 as
being 6.25 for $F = 1$. Therefore the expected depression at $F = 0.3143$ is
$0.3143 \times 6.25 = 1.964$, and the rate per generation is $1.964/30 = 0.065$. Counteracting
selection would be required to give a response of 0.065 per generation. Putting
$h^2 = 0.22$ into [11.2], this makes $S = 0.065/0.22 = 0.30$. With no selection on males,
the selection differential on females would have to be 0.6 mice per litter each
generation on average. This means that if, for example, 1 in 5 females in each generation
were replacements subject to selection, they would have had to come from litters
averaging $0.6 \times 5 = 3$ mice per litter above their generation mean.
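The chain of arithmetic above can be retraced in a few lines. This is an illustrative sketch only: the function name is hypothetical, and it assumes that [4.1] is $\Delta F = 1/(2N_e)$ and that [3.12] is $F_t = 1 - (1 - \Delta F)^t$, which reproduce the figures quoted.

```python
def counteracting_selection(n_e, generations, depression_at_f1, h2):
    """Selection differential needed to offset inbreeding depression."""
    delta_f = 1.0 / (2.0 * n_e)                      # rate of inbreeding
    f_t = 1.0 - (1.0 - delta_f) ** generations       # inbreeding coefficient at t
    rate = f_t * depression_at_f1 / generations      # depression per generation
    s_mean = rate / h2                               # S averaged over both sexes
    s_females = 2.0 * s_mean                         # all selection on females
    return delta_f, f_t, rate, s_mean, s_females


print(counteracting_selection(40, 30, 6.25, 0.22))
```

Small differences from the printed values arise because the text rounds at each step.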

110 (19.8) The predicted response to selection for the index is given by [19.17],
but we first need to find the variance of index values by [19.18], for which we need
the unscaled values $b_1 = 0.8076$ and $b_2 = -0.1551$. By [19.18]

$$\sigma_I^2 = b_1A_{11} + b_2A_{21}$$
$$= (0.8076 \times 0.6407) + (-0.1551 \times 0.8914)$$
$$= 0.3792$$
$$\sigma_I = 0.6158$$

The response to selection by the index, by [19.17], is

$$R_I = 1.5775 \times 0.6158 = 0.971$$

The units in which the index was calculated in Problem 19.7 were 100 g. Therefore
the predicted improvement is 97 g per generation.

The response to selection for growth alone was calculated in Problem 19.3. In
100 g units it is $R = 0.911$. So the relative effectiveness of index selection is

$$\frac{R_I}{R} = \frac{0.971}{0.911} = 1.066$$

The index is expected to be 6.6 per cent better. In this case the use of the secondary
character gives little benefit.
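The same steps can be written as a short calculation. This is a sketch with hypothetical names, using only the numbers quoted in the solution.

```python
import math


def index_response(intensity, weights, covariances):
    """Return (variance, standard deviation, response) for a selection index.

    weights     -- index weights b_1, b_2
    covariances -- additive covariances A_11, A_21 of the index characters
                   with the character being improved
    """
    variance = sum(b * a for b, a in zip(weights, covariances))
    sd = math.sqrt(variance)
    return variance, sd, intensity * sd


variance, sd, response = index_response(1.5775, [0.8076, -0.1551], [0.6407, 0.8914])
print(round(variance, 4), round(sd, 4), round(response, 3))
print(round(response / 0.911, 3))  # effectiveness relative to selection on growth alone
```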


111 (11.7) Both the intensity of selection and the generation length are now
different for males and females. Assume that the numbers of males and females are
equal at maturity. The average number of lambs per ewe per season is 1.2 as stated
in Problem 11.6. The number of male lambs per male parent per season is therefore
$\tfrac{1}{2} \times 1.2 \times 10 = 6$. The males are bred for 2 seasons, so each has on average 12
male offspring from which 1 must be selected, giving $i = 1.840$. The selection
intensity on females for female replacements is $i = 0.704$ as calculated in Problem
11.6. The generation length is 2.5 years for males and 3 years for females. The
mean intensity of selection per year is therefore

$$\frac{\tfrac{1}{2}(1.840 + 0.704)}{\tfrac{1}{2}(2.5 + 3)} = 0.463$$

The effectiveness relative to the optimal procedure in Problem 11.6 is 0.463/0.274
= 1.69. It would be 69 per cent better.

The reason why there is an optimal age for discarding parents is that $L$ increases
by equal steps but $i$ increases by diminishing steps. The reason for the optimal age
being lower is the greater intensity of selection on males. The optimal age for
discarding both sexes depends on the mean $i$ and the mean $L$, not on the ratio of $i$ to $L$
in each sex separately.
112 (2.11) The effect of selection in males and females is different and has to
be worked out separately. The heterogametic sex will be referred to as male. Let
$q$ be the initial frequency of the recessive gene. The genotypes and their initial
frequencies are shown below, with the gene frequencies among the survivors and
in the next generation. For the gene frequency among surviving females see the
derivation of [2.9].

                            Males           Females
  Genotypes                 A     a         AA     Aa     aa
  Initial frequencies       p     q         p^2    2pq    q^2
  Gene frequency
    in survivors            0               q/(1 + q)
    in next generation      q/(1 + q)       q/[2(1 + q)]

The overall gene frequency among survivors of both sexes is

$$\frac{2q}{3(1 + q)}$$

(see [1.4]).

In the zygotes of the next generation the gene frequency is no longer the same in
the two sexes, but the overall gene frequency is not further changed. We can therefore
take the above expression as the overall gene frequency after one generation
of selection. The change of gene frequency is then

$$\Delta q = \frac{2q}{3(1 + q)} - q$$

which simplifies to

$$\Delta q = -\frac{q(1 + 3q)}{3(1 + q)}$$
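The simplification can be confirmed with exact fractions. The sketch below is illustrative; the function names are hypothetical, and the weighting of one-third for males and two-thirds for females is the one implied by the overall frequency $2q/[3(1 + q)]$.

```python
from fractions import Fraction


def overall_frequency_after_selection(q):
    """Overall gene frequency among survivors of both sexes."""
    males = Fraction(0)                 # all males carrying the gene are eliminated
    females = q / (1 + q)               # surviving females are AA and Aa only
    return Fraction(1, 3) * males + Fraction(2, 3) * females


def delta_q(q):
    """Change of gene frequency from one generation of selection."""
    return overall_frequency_after_selection(q) - q


q = Fraction(1, 10)
print(delta_q(q), -q * (1 + 3 * q) / (3 * (1 + q)))
```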


113 (4.4) The self-fertile variety breeds as an idealized population, for which
$N_e = N$; $\Delta F = 1/(2N_e) = 2.50$ per cent. With the self-sterile variety $\Delta F = 2.44$ per
cent by [4.2b]. Exclusion of self-fertilization makes little difference, and even less
with larger $N$.

114 (18.3) This form of analysis is equivalent to working on the ‘0, 1’ scale. 
It makes no difference if the values assigned to individuals are 0 and 1 rather than 
1 and 2. Each half-sib family has a mean which, with values 0, 1, is the proportion 
of its members that have twins. The covariance of half sibs is thus the variance of 
the proportion among half-sib families. On the 0, 1 scale, then, the heritability is 
4 × 0.0058 = 2.32 per cent. This can be converted to the liability scale by [18.4].
We need the population incidence and the corresponding i, which from Problem 
18.2 are p = 0.035, i = 2.208. The heritability of liability is then 
$$\sigma_I^2 = (b_1 + b_2k)h^2\sigma_A^2 = 1.050\,h^2\sigma_A^2$$

The expected response to selection by the index is given by [13.15] as

$$R_I = i\sigma_I = i(1.025)h\sigma_A$$

This is directly comparable with the expected response to individual selection, $R$,
in [11.4], and, provided the intensity of selection, $i$, is the same,

$$R_I/R = 1.025$$

Index selection would be 2.5 per cent better than individual selection. The benefit
is small because $t$ is not very different from $r$.

118 (10.8) All three regressions and the full-sib correlation are significantly
different from zero but, with the large standard errors, the partitioning of the variance
can only be tentative. The regression of offspring on midparent estimates the
heritability without bias from the assortative mating, giving $h^2 = 0.82$. Correcting
the regressions on one parent for the bias caused by the assortative mating ($r = 0.33$)
by equation (14) of Table 10.6 gives $h^2 = 2b/(1 + r) = 0.71$ and 0.72
respectively. These are not inconsistent with the estimate from the midparent regression.
To deal with the effect of assortative mating on the sib correlation we need to know
the correlation of breeding values, $m$. Not knowing this, we might take $m = r$ as
an approximation. Then the correlation of full sibs in respect of breeding values,
from equation (15) of Table 10.6, and taking $h^2 = 0.82$, is $t = 0.41 \times 1.33 = 0.55$.
The observed correlation is higher than this, though not significantly,
suggesting that there may be some environmental variance common to full sibs, or some
dominance variance, amounting to $0.71 - 0.55 = 0.16$, or 16 per cent of the total variance.

119 (14.6) Let $\bar{P}$ be the mean yield of the two parents and let $F_1$, $F_2$, $F_3$
represent the yields of these generations. Then the predicted yields are

$$F_2 = \tfrac{1}{2}(\bar{P} + F_1)$$
$$F_3 = \tfrac{1}{2}(\bar{P} + F_2)$$

giving

              Observed             Predicted
  Cross       P (mean)   F_1       F_2       F_3
  (1)         1.40       1.41      1.405     1.40
  (2)         1.08       1.42      1.25      1.165

The reasons for these expectations may be made clearer by consideration of the
heterosis, $H$. By definition, $H_{F_1} = F_1 - \bar{P}$. By [14.10], $H_{F_2} = \tfrac{1}{2}H_{F_1}$. The expected
yield of the $F_2$ is

$$\bar{P} + H_{F_2} = \bar{P} + \tfrac{1}{2}(F_1 - \bar{P}) = \tfrac{1}{2}(\bar{P} + F_1)$$

Each generation of selfing halves the frequency of heterozygotes (Table 5.1), so
the expected heterosis in the $F_3$ is half that of the $F_2$. Thus the predicted yield of
the $F_3$ is $\bar{P} + \tfrac{1}{2}(F_2 - \bar{P}) = \tfrac{1}{2}(\bar{P} + F_2)$.
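The two predictions are easily tabulated. A minimal sketch follows; the function name is hypothetical and the inputs are the observed values from the table above.

```python
def predict_selfed_generations(parent_mean, f1):
    """Predicted F2 and F3 yields when heterosis halves with each selfing."""
    f2 = 0.5 * (parent_mean + f1)
    f3 = 0.5 * (parent_mean + f2)
    return f2, f3


for label, parent_mean, f1 in (("(1)", 1.40, 1.41), ("(2)", 1.08, 1.42)):
    print(label, predict_selfed_generations(parent_mean, f1))
```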


120 (19.9) The correlated response is given by [19.19], but we first need to get
the additive genetic covariance of character 2 with the index. The covariance of
character 1 with the index is given by [19.20]. The covariance of character 2 with
the index, which is given in Example 19.5, is

$$\mathrm{cov}_{2I} = b_2A_{22} + b_1A_{21}$$

$A_{22}$ was not calculated in Problem 19.7. It is $0.40 \times (2.48)^2 = 2.4602$.

$$\mathrm{cov}_{2I} = (-0.1551 \times 2.4602) + (0.8076 \times 0.8914) = 0.3383$$

To get the response from [19.19] we need $i = 1.5775$ and $\sigma_I = 0.6158$, both
coming from Problem 19.8. The correlated response of food consumption is then

$$CR_2 = 1.5775 \times \frac{0.3383}{0.6158} = 0.87 \text{ (in 100 g units)}$$

The food consumption would be expected to increase by 87 g per generation.


121 (11.8) Let $n$ be the number of litters on which selection is based, and let
$t$ be the generation length in years. Let $R$ and $R_n$ be the responses when selection
is based on 1 and on $n$ litters respectively. The responses can be compared by [11.3],
$R = ih^2\sigma_P$, but it is simpler to do so by [11.4], $R = ih\sigma_A$, because $\sigma_A$ does not
change with the number of litters. We do not need to evaluate the actual responses
because the optimal value of $n$ will be the number that gives the maximal value of
the ratio

$$\frac{R_n/t_n}{R/t} = \frac{i_n h_n t}{i h t_n}$$

The calculations are shown below. Note that the expected number of females in each
litter is 4, so the proportion of sows that must be selected is 1/4 with $n = 1$, 1/8
with $n = 2$ etc. The ratio $h_n^2/h^2$ was calculated in Problem 8.6 for the repeatability
of 0.409. The values of $i$ are found from Appendix Table A.

  n    t      p%      i        h_n^2/h^2   i_n/i    h_n/h    t/t_n    (R_n/t_n)/(R/t)
  1    1      25      1.271    -           -        -        -        -
  2    1.5    12.5    1.647    1.42        1.296    1.192    0.667    1.03
  3    2      8.33    1.840    1.65        1.448    1.285    0.500    0.93

The optimal number of litters is 2.
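The table can be regenerated from the intensities and the repeatability. This sketch is illustrative only: the function name is hypothetical, and it assumes that the ratio from Problem 8.6 is $h_n^2/h^2 = n/[1 + (n - 1)r]$ with $r$ the repeatability, which reproduces the tabulated 1.42 and 1.65.

```python
import math


def relative_rate(n, i_n, t_n, i_1=1.271, t_1=1.0, repeatability=0.409):
    """Response per year with n litters relative to selection on one litter."""
    h2_ratio = n / (1.0 + (n - 1) * repeatability)   # h_n^2 / h^2
    return (i_n / i_1) * math.sqrt(h2_ratio) * (t_1 / t_n)


for n, i_n, t_n in ((2, 1.647, 1.5), (3, 1.840, 2.0)):
    print(n, round(relative_rate(n, i_n, t_n), 2))
```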
122 (2.12) Assume that the gene frequency will be low enough for females
homozygous for the deleterious gene to be ignored, as an approximation.
Elimination of the gene by selection then takes place only in males. One-third of the genes
are in males, so the proportion $sq/3$ of the genes are eliminated by selection in each
generation, where $s$ is the coefficient of selection. At equilibrium these are replaced
by mutation, so $sq/3 = u$, and $q = 3u/s$.

The muscular dystrophy data fit this expectation very well with $s = 1$ and no
selection against heterozygous females. The gene frequency, neglecting the male-female
difference, is the incidence in males, which is almost exactly three times the estimated
mutation rate, as expected.

123 (4.5) Let $k$ = number of offspring used, i.e. family size, as given. The
actual number of parents, including the sterile pair, is $N = 16$. To get $N_e$ from [4.7]
we need to calculate the variance of $k$. The mean is $\bar{k} = 2$, and the deviations
$(k - \bar{k})$ are: -2, -1, -1, 0, 0, +1, +1, +2, from which $\Sigma(k - \bar{k})^2 = 12$ and
$V_k = 12/8 = 1.5$. (Divided by 8, not 7, because this is the whole 'population' in
the statistical sense, not a sample.) Then by [4.7] $N_e = 64/3.5 = 18.3$. $N_e$ is still
a little larger than $N$ in spite of the failure to equalize family size.
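The calculation can be reproduced as follows. This is a sketch under two assumptions: that [4.7] is $N_e = 4N/(2 + V_k)$, which gives the quoted 64/3.5, and that the family sizes are those recovered from the mean and the deviations listed above.

```python
from statistics import pvariance


def effective_size(family_sizes, n_parents):
    """Effective population size from the variance of family size."""
    v_k = pvariance(family_sizes)        # population variance: divide by 8, not 7
    return v_k, 4.0 * n_parents / (2.0 + v_k)


family_sizes = [0, 1, 1, 2, 2, 3, 3, 4]  # mean 2 plus the deviations -2 ... +2
print(effective_size(family_sizes, 16))
```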

124 (18.4) Care is needed with the signs of $x$ and $i$. Here $x$ will denote the
population mean as a deviation from the threshold so that, with incidences less than 50
per cent, $x$ is negative. The mean of affected individuals, $i_A$, is a deviation from
the population mean and is positive. We shall need also the mean of unaffected
individuals, $i_N$, and this is negative. Appendix Table A provides values of $i_A$ for the
incidence $p$, from which $i_N$ is got as $i_N = -i_A p/(1 - p)$. The two generations have
to be worked out separately because the change of incidence resulting from the first
selection changes the selection differential in the second selection.

Selection for reduced incidence will be explained first because it is simpler. The
working is shown below. Since more than 50 per cent of individuals are normal
all selected individuals of both sexes will be normal and the selection differential
in standard deviation units will be $S = i_N$. The response is $R = h^2S$, and the new
mean liability is $x_1 = x_0 + R$. To get the new incidence we have to find the value
of $p$ corresponding to $x_1$ in Appendix Table A. The values given were obtained by
interpolation, but it will make little difference if the nearest tabulated value of $p$
is taken. For the second generation the calculation is repeated using $p_1$ and $x_1$ in
place of $p_0$ and $x_0$. The prediction is that the incidence will be reduced to 17 per
cent.


Selection for reduced incidence

  First selection                   Second selection
  p_0               23%             p_1               19.6%
  x_0               -0.739          x_1               -0.857
  i_A               1.320           i_A               1.411
  S = i_N           -0.394          S = i_N           -0.344
  R = 0.3S          -0.118          R = 0.3S          -0.103
  x_1 = x_0 + R     -0.857          x_2 = x_1 + R     -0.960
  p_1               19.6%           p_2               16.9%

The calculation of selection for increased incidence, shown below, is the same
as before except that the selection differential is more complicated. Since the
proportion of selected males is less than the incidence, all selected males will be affected
and the selection differential on males will be $S_♂ = i_A$. The selected females,
however, will have to include some normals because fewer than 50 per cent are
affected. In the first selection, the proportion of the selected females that are affected
will be 0.23/0.5 = 0.46, and the proportion that are normal will be 0.54. The
selection differential on females will thus be $S_♀ = 0.46i_A + 0.54i_N$. Or, in terms of
the incidence, $S_♀ = 2pi_A + (1 - 2p)i_N$. The combined selection on both sexes
will be $S = \tfrac{1}{2}(S_♂ + S_♀)$. The rest of the calculation needs no further explanation. After
the second selection the incidence will be increased to 41 per cent. The much greater
response to selection for increased incidence results from the larger selection
differential.


Selection for increased incidence

  First selection                               Second selection
  p_0                           23%             p_1       31.5%
  x_0                           -0.739          x_1       -0.482
  i_A                           1.320           i_A       1.128
  i_N                           -0.394          i_N       -0.519
  S♂ = i_A                      1.320           S♂        1.128
  S♀ = 2p i_A + (1 - 2p) i_N    0.3944          S♀        0.5186
  S = ½(S♂ + S♀)                0.8572          S         0.8233
  R = 0.3S                      0.257           R         0.247
  x_1                           -0.482          x_2       -0.235
  p_1                           31.5%           p_2       40.7%
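Both tables can be approximated without Appendix Table A by using the standard normal distribution directly. The sketch below is not the book's method of table look-up and interpolation; the function names are hypothetical. It assumes the heritability of liability 0.3 used in the tables, and the female differential it uses holds only when half the females are selected and the incidence is below 50 per cent.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def liability_parameters(p):
    """Return x, i_A and i_N for incidence p (x negative when p < 0.5)."""
    x = STD_NORMAL.inv_cdf(p)
    i_a = STD_NORMAL.pdf(x) / p          # mean liability of affected individuals
    i_n = -i_a * p / (1.0 - p)           # mean liability of normal individuals
    return x, i_a, i_n


def select_once(p, h2, increase):
    """Incidence after one generation of selection on a threshold character."""
    x, i_a, i_n = liability_parameters(p)
    if increase:
        s_males = i_a                                    # all selected males affected
        s_females = 2 * p * i_a + (1 - 2 * p) * i_n      # half the females selected
        s = 0.5 * (s_males + s_females)
    else:
        s = i_n                                          # only normals selected
    return STD_NORMAL.cdf(x + h2 * s)


def select_twice(p0, h2, increase):
    p1 = select_once(p0, h2, increase)
    return p1, select_once(p1, h2, increase)


print(select_twice(0.23, 0.3, increase=False))
print(select_twice(0.23, 0.3, increase=True))
```

The results differ slightly in the third decimal place from the tabulated values because the tables were obtained by interpolation.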


125 (15.6) Let $X$ be the true mean and $M$ the observed mean of a cross. Selecting
1 out of 50 gives $i = 2.249$ from Appendix Table B. The deviation of the best cross
from the mean of all the crosses is therefore $i\sigma_M$, where $\sigma_M$ is the standard
deviation of observed means. $\sigma_M^2$ was calculated in Problem 15.5. The expected mean
of all crosses is the original population mean, which was 308. Adding 308 to $i\sigma_M$
gives the predicted observed mean of the best cross.

         F      sigma_X^2    sigma_M^2    sigma_M    i sigma_M    M, predicted
  (1)    0.5    479          745          27.3       61.4         369.4
  (2)    1.0    1052         1289         35.9       80.7         388.7


The prediction of future performance was explained in Chapter 8 in connection
with repeatability. The expected mean of the repeated cross, the future performance,
is the true mean of the cross. To predict the true mean we need to know the
regression of the true means on observed means. The observed mean is $M = X + E$, where
$E$ is the deviation due to sampling error. Now,

$$\mathrm{cov}_{XM} = \mathrm{cov}_{X(X + E)} = \mathrm{cov}_{XX} + \mathrm{cov}_{XE}$$

$\mathrm{cov}_{XX} = \sigma_X^2$ and $\mathrm{cov}_{XE} = 0$ because $X$ and $E$ are uncorrelated. Thus
$\mathrm{cov}_{XM} = \sigma_X^2$. The required regression is $b_{XM} = \mathrm{cov}_{XM}/\sigma_M^2 = \sigma_X^2/\sigma_M^2$.
The rest of the calculation is as follows.

         b_XM = sigma_X^2/sigma_M^2     Predicted true mean of best cross
  (1)    0.643                          308 + 0.643(61.4) = 347.5
  (2)    0.816                          308 + 0.816(80.7) = 373.9

126 (16.4) In each case the right-hand column of the table in the solution represents
the proportion of each genotype in the progeny. It is simplest to consider the
proportion of homozygotes, which is the inbreeding coefficient, $F$, relative to the first
generation, in which there are no homozygotes. Then the amount of heterosis relative
to that of the single crosses is $H_t = 1 - F_t$.

  Generation     Two lines              Three lines
  (t)            F_t       H_t          F_t       H_t
  1              0         1            0         1
  2              1/2       0.5          0         1
  3              1/4       0.75         1/4       0.75
  4              3/8       0.625        1/8       0.875
  5              (5/16     0.6875)      2/16      0.875
  Many           1/3       0.667        1/7       0.857

If the series of generations were continued the heterosis would be found to settle
down after a few cycles to 2/3 in the case of two lines and 6/7 in the case of three
lines. The general formula for the final level of inbreeding relative to the first cross
is $1/(2^n - 1)$, where $n$ is the number of lines.
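The series can be generated by tracking the line composition of the crossbred dams. The sketch below is one way of doing this and is an assumption rather than the book's working: generation 1 is taken to be the single cross of the first two lines, the sire lines are then used in rotation, and $F_t$ is taken as the share of the dam's genes that come from the sire's own line.

```python
from fractions import Fraction


def rotational_inbreeding(n_lines, generations):
    """Relative inbreeding F_t over successive generations of rotational crossing."""
    composition = [Fraction(0)] * n_lines
    composition[0] = composition[1] = Fraction(1, 2)   # generation 1: single cross
    f_values = [Fraction(0)]
    for t in range(2, generations + 1):
        sire = t % n_lines                             # next pure line in rotation
        f_values.append(composition[sire])             # dam's genes shared with sire line
        composition = [share / 2 for share in composition]
        composition[sire] += Fraction(1, 2)            # half the progeny's genes from sire
    return f_values


for n_lines in (2, 3):
    f_values = rotational_inbreeding(n_lines, 5)
    print(n_lines, [str(f) for f in f_values], [float(1 - f) for f in f_values])
```

Run for many generations, the values approach $1/(2^n - 1)$, in line with the general formula above.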

127 (13.6) The index required is for mother and mean of half sisters. The index
equations are

$$b_2P_{22} + b_3P_{23} = A_{21}$$
$$b_2P_{32} + b_3P_{33} = A_{31}$$

Subscript 1 refers to the individual to be selected, which is not measured; 2 refers
to the mother and 3 refers to the mean of the half sisters. $P_{22} = \sigma^2$ (i.e. the
phenotypic variance); $P_{23} = P_{32} = 0$; $P_{33} = K\sigma^2$, where $K = [1 + (n - 1)t]/n$,
as before. $A_{21}$ is the additive covariance of the individual with its mother $= r_2\sigma_A^2
= r_2h^2\sigma^2$, where $r_2 = \tfrac{1}{2}$. $A_{31}$ is the additive covariance of the individual with the
mean of its half sisters in which the individual is not included; this is the covariance
relevant to sib selection, and this is not affected by the number of half sibs, for the
reason explained in the paragraph following equation [9.2]. Thus $A_{31} = r_3h^2\sigma^2$,
where $r_3 = \tfrac{1}{4}$. Substituting these values in the index equations gives

$$b_2\sigma^2 + 0 = r_2h^2\sigma^2$$
$$0 + b_3K\sigma^2 = r_3h^2\sigma^2$$

from which

$$b_2 = r_2h^2 = \tfrac{1}{2}h^2$$
$$b_3 = r_3h^2/K = \tfrac{1}{4}h^2/K$$

Now evaluate $K$. Because there is no environmental resemblance between half sisters,
$t = \tfrac{1}{4}h^2 = 0.0875$. With $n = 10$ this gives

$$K = \frac{1 + (9 \times 0.0875)}{10} = 0.17875$$

Substituting for $K$ and for $h^2 = 0.35$ gives the unscaled index

$$I = 0.175P_2 + 0.490P_3$$
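Because the two sources of information are uncorrelated, the weights come out one at a time. A short sketch with hypothetical names:

```python
def mother_half_sister_index(h2, n, r_mother=0.5, r_half_sib=0.25):
    """Index weights for the mother's record and the mean of n half sisters."""
    t = r_half_sib * h2                    # half-sib correlation, no common environment
    k = (1.0 + (n - 1) * t) / n            # variance of the half-sister mean, in sigma^2
    b_mother = r_mother * h2
    b_sisters = r_half_sib * h2 / k
    return t, k, b_mother, b_sisters


print(mother_half_sister_index(0.35, 10))
```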

128 (10.9) The equations in Table 10.6 provide the solutions. Because the choice
of mates is purely phenotypic, $m = rh^2$. Then equation (15) is $t = \tfrac{1}{2}h^2(1 + rh^2)$,
where $h^2$ is the heritability in the population mating assortatively. Rearranging gives
$\tfrac{1}{2}rh^4 + \tfrac{1}{2}h^2 - t = 0$, which has the solution

$$h^2 = \left[-\tfrac{1}{2} \pm \sqrt{\tfrac{1}{4} + 2rt}\,\right]/r$$

Substituting the values of $r$ and $t$ gives only one possible solution, which is
$h^2 = 0.50$. The heritability in the same population if it mated at random, $h_0^2$, is given
by equation (9). Substituting $h^2 = 0.5$ and $m = rh^2 = 0.2$ gives

$$h_0^2 = 0.5\,(0.8/0.9) = 0.44$$

129 (14.7) It follows from [14.10] that the difference between the $F_1$ and the $F_2$
is half the heterosis, i.e. $F_1 - F_2 = \tfrac{1}{2}H$, from which $H = -0.24$. But the
difference between the $F_1$ and $F_2$ was very small and non-significant, so this small
amount of heterosis deduced is not significantly different from zero. In fact the means
of the parental varieties were 17.88 ± 0.24 and 15.00 ± 0.19, making $\bar{P} = 16.44
\pm 0.20$, and the observed heterosis was $H = -0.72 \pm 0.28$.

130 (19.10) The index equations from [19.15] are

$$b_1P_{11} + b_2P_{12} = a_1A_{11} + a_2A_{12}$$
$$b_1P_{21} + b_2P_{22} = a_1A_{21} + a_2A_{22}$$

The left-hand sides, being phenotypic parameters, are the same as in Problem 19.7.
For the right-hand sides we need $A_{11} = 0.6407$ and $A_{12} = A_{21} = 0.8914$, both also
from Problem 19.7. $A_{22} = 2.4602$ as calculated in Problem 19.9. The economic
values are already in the 100 g units used for the other parameters in the equations,
so $a_1 = 8$ and $a_2 = -2$. The index equations with the values entered are

$$1.2321\,b_1 + 2.2848\,b_2 = (8 \times 0.6407) + (-2 \times 0.8914) = 3.3428$$
$$2.2848\,b_1 + 6.1504\,b_2 = (8 \times 0.8914) + (-2 \times 2.4602) = 2.2108$$

Eliminating $b_2$ as in Problem 19.7 gives

$$2.3576\,b_1 = 20.5596 - 5.0512$$
$$b_1 = 6.578$$
$$b_2 = -2.084$$

The index for selection is

$$I = 6.578G - 2.084F$$

or, more conveniently,

$$I' = G - 0.317F$$
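The pair of equations can also be solved by Cramer's rule, which is equivalent to the elimination used above. A minimal sketch with a hypothetical function name:

```python
def solve_index_equations(p11, p12, p22, rhs1, rhs2):
    """Solve the two index equations for b_1 and b_2 (P_12 = P_21)."""
    determinant = p11 * p22 - p12 * p12
    b1 = (rhs1 * p22 - p12 * rhs2) / determinant
    b2 = (p11 * rhs2 - p12 * rhs1) / determinant
    return b1, b2


rhs1 = 8 * 0.6407 + (-2) * 0.8914      # a_1 A_11 + a_2 A_12
rhs2 = 8 * 0.8914 + (-2) * 2.4602      # a_1 A_21 + a_2 A_22
b1, b2 = solve_index_equations(1.2321, 2.2848, 6.1504, rhs1, rhs2)
print(round(b1, 3), round(b2, 3), round(-b2 / b1, 3))
```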

131 (11.9) We have first to get the coefficient of selection, $s$, acting on the gene
by [11.8]. Substituting $2a = 0.3$, $\sigma_P = 2.0$ and $i = 1.755$ gives $s = 0.2633$. The
gene frequency after one generation of selection is then given by line (1) of Table
2.2. The gene frequency $q$ in the formula is that of the allele selected against, so
we must put $q = 0.6$. The formula can be rewritten in a form that makes the
substitutions simpler:

$$q_1 = q[1 - \tfrac{1}{2}s(1 + q)]/(1 - sq)$$
$$= 0.6[1 - 0.1316(1.6)]/(1 - 0.1580)$$
$$= 0.5625$$

For the second generation, put $q_1$ in place of $q$, giving

$$q_2 = 0.5245$$

The frequency of the increasing allele will therefore be 0.4755.
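The two generations can be iterated directly from the rewritten formula. This is a sketch with hypothetical names; it takes $s = i \times 2a/\sigma_P$, which is the substitution that yields the quoted 0.2633.

```python
def selection_coefficient(intensity, two_a, sd_p):
    """Coefficient of selection on a gene affecting a metric character."""
    return intensity * two_a / sd_p


def next_frequency(q, s):
    """Frequency of the allele selected against after one generation."""
    return q * (1.0 - 0.5 * s * (1.0 + q)) / (1.0 - s * q)


s = selection_coefficient(1.755, 0.3, 2.0)
q1 = next_frequency(0.6, s)
q2 = next_frequency(q1, s)
print(round(s, 4), round(q1, 4), round(q2, 4), round(1 - q2, 4))
```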


132 (2.13) Initial gene frequency of red, $q_0 = \sqrt{0.01} = 0.10$. (1) By [2.9],
$q_2 = 0.1/1.2 = 0.0833$. Frequency of red calves $= q_2^2 = 0.0069$. (2) This is tricky
because more selection is applied to bulls than to cows. Consequently, after
selection the gene frequency is not the same in the male and female parents and the
genotypes in the progeny are not in Hardy-Weinberg frequencies. The change of
gene frequency has to be worked out separately for each sex. The frequency of
heterozygotes before selection is 0.18. The proportion of heterozygous bulls that
will escape detection by having no red progeny in the test is $(\tfrac{3}{4})^6 = 0.178$. The
proportion of all bulls that are undetected heterozygotes is therefore 0.18 × 0.178
= 0.0320.


First generation

                        Bulls                                Cows
                        RR       Rr       rr     Total       RR       Rr       rr     Total
  Before selection      0.81     0.18     0.01   1.00        0.81     0.18     0.01   1.00
  After selection       0.81     0.0320   0      0.8420      0.81     0.18     0      0.99
                        0.9620   0.0380   0      1.0000      0.8182   0.1818   0      1.0000

  q_1m = ½ × 0.0380 = 0.0190          q_1f = ½ × 0.1818 = 0.0909


Now make a table of gamete frequencies, like Table 1.2, and get the genotype
frequencies in the progeny.

                              Male gametes
                              R 0.9810      r 0.0190
  Female       R 0.9091       0.8918        0.0173
  gametes      r 0.0909       0.0892        0.0017



Second generation

                        Bulls                                Cows
                        RR       Rr       rr       Total     RR       Rr       rr       Total
  Before selection      0.8918   0.1065   0.0017   1.0000    0.8918   0.1065   0.0017   1.0000
  After selection       0.8918   0.0190   0        0.9108    0.8918   0.1065   0        0.9983
                        0.9792   0.0208   0        1.0000    0.8933   0.1067   0        1.0000

  q_2m = 0.0104          q_2f = 0.05335

The frequency of red calves in the progeny is $q_{2m}q_{2f}$ = 0.06 per cent.

(Note: $q_{1f}$ can be got more easily by [2.9], but $q_{2f}$ cannot be got this way because
the genotypes before the second selection are not in Hardy-Weinberg frequencies.)
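The two generations of unequal selection can be followed step by step. The sketch below uses hypothetical names and carries unrounded values, so its results differ from the tables in the last decimal place. It takes the escape probability $(\tfrac{3}{4})^6$ from the solution.

```python
def progeny_test_selection(q0, generations=2, tested_offspring=6):
    """Gene frequencies among selected bulls and cows in each generation.

    Red cows are culled; red bulls and detected heterozygous bulls are culled.
    """
    escape = 0.75 ** tested_offspring    # heterozygous bull leaves no red calf
    q_m = q_f = q0                       # gamete frequencies forming the calves
    history = []
    for _ in range(generations):
        rr_dominant = (1 - q_m) * (1 - q_f)              # RR calves
        het = q_m * (1 - q_f) + q_f * (1 - q_m)          # Rr calves
        q_m = 0.5 * het * escape / (rr_dominant + het * escape)
        q_f = 0.5 * het / (rr_dominant + het)
        history.append((q_m, q_f))
    return history


history = progeny_test_selection(0.10)
q_2m, q_2f = history[-1]
print(history, q_2m * q_2f)   # the product is the frequency of red calves
```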

133 (4.6) The breeding plan was minimal inbreeding with $N = 16$, so by [4.9],
$N_e = 31$.

The data led to the inbreeding coefficient at generation 27 being $F(t = 27) = 0.447$.
By [3.12]

$$(1 - \Delta F)^{27} = 1 - 0.447 = 0.553$$
$$27 \log(1 - \Delta F) = -0.2573$$
$$1 - \Delta F = 0.9783$$
$$\Delta F = 0.0217$$
$$N_e = 1/(2\Delta F) = 23.0$$
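The logarithmic step can be replaced by a direct root. A minimal sketch with a hypothetical function name:

```python
def effective_size_from_inbreeding(f_t, generations):
    """Rate of inbreeding and effective size implied by F after t generations."""
    delta_f = 1.0 - (1.0 - f_t) ** (1.0 / generations)
    return delta_f, 1.0 / (2.0 * delta_f)


print(effective_size_from_inbreeding(0.447, 27))
```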

134 (18.5) Here $x$ is the deviation of the threshold from the population mean.
The values of $x$ and $i$ needed are as follows.

  Class     Threshold    p%    x          i
  N         T_1          20    -0.842     -1.400
  H + F     T_1          80    -0.842     +1.400 × 20/80 = +0.350
  F         T_2          30    +0.524     +1.159

The value of $i$ for $p$ = 80 per cent is the value for $p$ = 20 per cent multiplied by
$(1 - p)/p$, as stated at the head of Appendix Table A.

(1) Difference between thresholds
$= 0.524 - (-0.842) = 1.366\sigma$
1 t.u. $= 1.366\sigma$; $\sigma = 0.732$ t.u.

(2) Population mean as deviation from lower threshold
$= +0.842\sigma = (0.842 \times 0.732)$ t.u. $= +0.616$ t.u.

(3) $\bar{N} = -1.400\sigma = -1.025$ t.u.

$\bar{F} = +1.159\sigma = +0.848$ t.u.

$\overline{(H + F)} = +0.350\sigma = +0.256$ t.u.

$\overline{(H + F)}$ is the mean of all individuals above the lower threshold, made up of
50/80 $H$ and 30/80 $F$. Therefore

$$\overline{(H + F)} = \tfrac{50}{80}\bar{H} + \tfrac{30}{80}\bar{F}$$
$$0.256 = 0.625\bar{H} + (0.375 \times 0.848)$$
$$\bar{H} = -0.099 \text{ t.u.}$$

To check, see that the mean of $N$, $H$, and $F$, each weighted by its frequency, sums
to zero as the mean deviation from the population mean.

  N:    -1.025 × 0.2 = -0.2050
  H:    -0.099 × 0.5 = -0.0495
  F:    +0.848 × 0.3 = +0.2544
                 Sum =  0.000


135 (15.7) The working follows exactly that of Example 15.2 and will not be
explained in detail. The varieties themselves, whose yields are on the diagonal of
the table, must be excluded because combining ability refers to performance in
crosses. The values of $T$ and $G$ are on the right of the table below. $G$ is the general
combining ability of the variety indicated on the left of the table. As examples,

$$T_B = 14.1 + 16.5 + 6.2 + 12.4 = 49.2$$

$$G_B = \frac{49.2}{3} - \frac{302.6}{5 \times 3} = -3.77$$

The $G$'s are deviations from the mean; to check, see that $\Sigma G = 0$.
The expected value of, for example, the cross A × B is

$$\bar{X} + G_A + G_B = 15.13 + 8.36 - 3.77 = 19.72$$

The deviation (SCA + error) is therefore $14.1 - 19.72 = -5.62$. To check, see
that the deviations sum to zero. The deviations are given in the table below.

          B         C         D         E         T         G
  A       -5.62     +0.55     -1.25     +6.32     85.6      +8.36
  B       -         +6.38     +0.18     -0.95     49.2      -3.77
  C       -         -         -0.25     -6.68     56.8      -1.24
  D       -         -         -         +1.32     44.5      -5.34
  E       -         -         -         -         66.5      +1.99
  Sum     -         -         -         -         302.6     0.00

  Mean of crosses = 302.6/20 = 15.13






136 (16.5) To avoid confusion, designate the breeds by letters to correspond with
those used in Problem 16.3, as shown below. In calculating the expectations for
each generation, remember that the formula for any generation applies to $W$ in that
generation and to $N$ in the next. The required total litter weight is $W \times N$.

  Breed or cross    Y       C       D       YC      YD      CD
  Designation       A       B       C       AB      AC      BC
  W                 84      78      88      90      96      92
  N                 7.9     6.6     6.3     8.2     7.3     7.4

                  Genotypes                   Expectations
  Generation      W            N              W        N        W × N
  1               AB           A              90       7.9      711
  2               X_1 × C      AB             94       8.2      771
  3               X_2 × A      X_1 × C        91.5     7.35     673
  4               X_3 × B      X_2 × A        89       7.675    683
  5               X_4 × C      X_3 × B        92.75    7.8      723

Note that the crossbred performance fluctuates and none is better than the
three-way cross in generation 2.

137 (13.7) The two intensities of selection cannot be combined because the
standard deviation of the index is not the same as that of yield. The response has to
be got from the predicted breeding values of males and of females. First, from [13.12]

$$\sigma_I^2 = b_2A_{21} + b_3A_{31}$$
$$= b_2r_2h^2\sigma^2 + b_3r_3h^2\sigma^2$$
$$= (b_2r_2 + b_3r_3)h^2\sigma^2$$
$$= [(0.175 \times 0.5) + (0.490 \times 0.25)] \times 0.35 \times \sigma^2$$
$$= 0.0735\,\sigma^2$$
$$\sigma_I = 0.2711 \times 696 = 188.69$$

Selecting 5 per cent of bulls gives $i = 2.063$, from Appendix Table A. The expected
breeding value of the selected bulls is given by [13.15]:

Bulls' breeding value $= i\sigma_I = 2.063 \times 188.69 = 389.3$

Selecting 50 per cent of cows gives $i = 0.798$, and the predicted breeding value
is given by [11.3] as:

Cows' breeding value $= ih^2\sigma_P = 0.798 \times 0.35 \times 696 = 194.4$

Both breeding values are deviations from the population mean and the expected
response is the mean of the two:

$$R = \tfrac{1}{2}(389.3 + 194.4) = 292 \text{ kg}$$
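The combination of index selection in bulls with individual selection in cows can be sketched as below. The function name is hypothetical; all numerical inputs are those quoted in the solution.

```python
import math


def two_sex_response(i_bulls, i_cows, weights, relationships, h2, sd_p):
    """Response when bulls are chosen on an index and cows on their own records."""
    index_variance = sum(b * r for b, r in zip(weights, relationships)) * h2
    sd_index = math.sqrt(index_variance) * sd_p       # sigma_I in kg
    bulls = i_bulls * sd_index                        # expected breeding value of bulls
    cows = i_cows * h2 * sd_p                         # expected breeding value of cows
    return sd_index, bulls, cows, 0.5 * (bulls + cows)


print(two_sex_response(2.063, 0.798, [0.175, 0.490], [0.5, 0.25], 0.35, 696))
```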

138 (10.10) In all cases we need to know both $N$, the number of families, and
$n$, the number of offspring or of sibs per family. The one of these that is not given
has to be deduced from $T = 400$. The relationships of $N$ and $n$ to $T$ are as follows.

  (1)   T = N(n + 1).    N = 200,   n = 1.
  (2)   T = N(n + 2).    n = 3,     N = 80.
  (3)   T = Nn.          n = 5,     N = 80.
  (4)   T = Nn.          N = 20,    n = 20.

In the case of (1) and (2) the number of individuals measured per family is the number
of offspring, $n$, plus one parent in (1) and two parents in (2).

Equation [10.8] gives the sampling variance of the regression in (1) and (2), and
[10.12] gives the sampling variance of the correlation in (3) and (4). In cases (1)
and (4), however, the designs are optimal and there is a shorter way of getting the
standard error of the heritability.

(1) The design is optimal because $n = 1$. With one parent measured ($k = 1$), [10.8]
reduces to the s.e. of $h^2$ given in [10.9].

$$\text{s.e.}(h^2) = 2/\sqrt{200} = 0.14$$

(2) $k = 2$; $t = \tfrac{1}{2}h^2 = 0.3$ (neglecting dominance); $N = 80$; $n = 3$.

$$\sigma_b^2 = \frac{2[1 + (2 \times 0.3)]}{3 \times 80} = 0.0133$$
$$\sigma_b = 0.115$$

The heritability is estimated by $b$, so

$$\text{s.e.}(h^2) = \sigma_b = 0.115$$

(Note: The approximation here is not very good. The exact formula (the equation
preceding [10.6]) gives $\sigma_b^2 = 0.0091$; s.e.($h^2$) = 0.095.)

(3) $t = \tfrac{1}{2}h^2 = 0.2$ (neglecting dominance); $N = 80$; $n = 5$.

$$\sigma_t^2 = \frac{2[1 + (4 \times 0.2)]^2(0.8)^2}{5 \times 4 \times 79} = 0.00262$$
$$\sigma_t = 0.051$$

$h^2$ is estimated as $2t$, so

$$\text{s.e.}(h^2) = 2\sigma_t = 0.10$$

(4) $N = 20$; $n = 20$.

This is the optimal design for half sibs because $n = 4/h^2$, and the sampling
variance of the heritability is given approximately by [10.15].

$$\sigma_{h^2}^2 = (32 \times 0.2)/400 = 0.0160$$
$$\text{s.e.}(h^2) = 0.13$$

The exact formula, [10.12], gives s.e.($h^2$) = 0.12.

139 (20.3) All that is needed before plotting the graph is to convert mortality to
survival and, perhaps, to express birth weights as deviations from the mean in standard
deviation units. For example

  Birth weight (kg)    Deviation in σ    Survival (%)
  1.3                  -4.24             38.8
  1.8                  -3.25             66.7
  etc.

Note the flat-topped nature of the curve: 97 per cent of babies have survival
probabilities within the narrow range of 97 to 99 per cent. The data give no grounds
for believing that small babies die because they are small; they may be small because
they have some other disability from which they die.

140 (19.11) The calculation follows that of Problem 19.8. The variance of the
index, by [19.18], is

$$\sigma_I^2 = (6.578 \times 3.3428) + (-2.084 \times 2.2108) = 17.3816$$
$$\sigma_I = 4.169$$

The intensity of selection was found in Problem 19.3 to be $i = 1.5775$. Therefore
the expected response to selection for the index is

$$R_I = 1.5775 \times 4.169 = 6.58$$

The predicted improvement in economic value is 6.58 cents per bird per generation.

When selection is made for growth alone, growth will increase as a direct response,
giving an economic gain, and food consumption will increase as a correlated response,
giving an economic loss. The responses were calculated in Problem 19.3. Converted
to 100 g units the direct response of growth is +0.911 and the correlated response
of food consumption is +1.268. The economic gain is therefore

  From growth:              0.911 × 8    =  7.29 cents
  From food consumption:    1.268 × (-2) = -2.54 cents
  Net economic gain                         4.75 cents

The relative effectiveness of the index for improving economic value is thus 6.58/4.75
= 1.39. The index would be 39 per cent better.
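The comparison can be set out as a short calculation. This is a sketch with hypothetical names that uses only figures quoted in the solution; because it carries unrounded intermediate values, the ratio comes out marginally below the printed 1.39.

```python
import math


def index_gain(intensity, weights, right_hand_sides):
    """Economic response to index selection: i times the index standard deviation."""
    variance = sum(b * rhs for b, rhs in zip(weights, right_hand_sides))
    return intensity * math.sqrt(variance)


def single_trait_gain(responses, economic_values):
    """Net economic gain from the direct and correlated responses."""
    return sum(r * a for r, a in zip(responses, economic_values))


gain_index = index_gain(1.5775, [6.578, -2.084], [3.3428, 2.2108])
gain_growth = single_trait_gain([0.911, 1.268], [8, -2])
print(round(gain_index, 2), round(gain_growth, 2), round(gain_index / gain_growth, 2))
```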
Adaptive value, 26
Additive 

combination of loci, 114 
gene action, 121 
variance, 126, 128 
Assortative mating, 21, 176, 182 
Asymmetry of selection response, 213, 
295 

Average effect, 115 

Backcrosses, see Crosses 
Base population, see Population 
BLUP, 245 
Breeding value, 118 

Canalization, 344 
Carriers, frequency of, 11 
Cats, 17 
Cattle 

genetic correlations, 315 
heritabilities, 165 
inbreeding, 249 
repeatabilities, 140, 145 
Clines, 46 
Coadaptation, 260 
Coancestry, 88, 142, 274 
Coefficient 
of consanguinity, 88 
of inbreeding, 62, 85 
of kinship, 88 
of relationship, 156 
of selection, 27, 201 
Combining ability 
estimation of, 276 
general, 275 
selection for, 284 
specific, 275 
Competition, 160 
Competitive index, 338 


Components 
of fitness, 336 
of variance, see Variance 
Consanguinity, coefficient of, 88 
Continuous variation, 104 
Controls, 197 
Corn, see Maize 
Correlation 

between characters, 313, 337; see also 
Correlation, genetic 
genetic, 314 
changes of, 319, 331 
estimation of, 316, 318 
examples of, 315 
and G x E interaction, 322 
and repeatability, 316 
between genotype and environment, 134

between mates, 176 
between relatives, 148—60; see also 
Covariance 
environmental, 158 
genetic, 149 
phenotypic, 160 
‘theoretical’, 156, 166 
between repeated measurements, 139 
intraclass, 148 
Covariance, 149 
of gene effects, 133 
of relatives 
full sibs, 154 
general, 155 
half sibs, 154 
offspring—parent, 150—3 
twins, 155, 173 
with assortative mating, 177 
with epistasis, 157 
Crossbreeding in animals, 286 
Crosses 
diallel, 277 





heterosis in 
backcrosses, 285 
F_1, F_2, 255
wide crosses, 260 
3-way, 4-way, 285 
inbreeding coefficient after 
backcrosses, 94 
2-way, 4-way, 95 
uniformity, 268, 283 
variance 
genetic, 273 
environmental, 268 

Degree of genetic determination, 126 
Developmental variation, 138, 269 
Deviation 
dominance, 119 
environmental, 111 
interaction (epistatic), 122 
Diallels, 277 

Discontinuous variation, 300 
Dispersive process, 51 
Dogs, 218 
Dominance 
cause of, 350 
degree of, 27, 112 
deviation, 119 
directional, 251 
overdominance, 27; see also 
Overdominance 
variance, 128 
Drift, see Random drift 
Drosophila melanogaster 
abdominal bristles 
bobbed locus, 350 
components of variance, 127, 132 
fitness relationship, 339, 341 
frequency distributions, 107, 222 
heritability, 165 
mutation, 224, 347 
number of loci, 217 
random drift, 266 
repeatability, 140-4
responses to selection, 190, 193, 217, 224-6

bar-eye facet number, 107 
effective population size, 73-4
egg number, 132, 165 
fitness, 338, 341 
genetic assimilation, 310-11
genetic correlations, 315, 318 
inbreeding depression, 248 
lethal genes, 32, 226, 349 
ovary size, 132, 142, 165 
random drift, 55, 59, 61, 81 
sternopleural bristles, 218, 224, 227, 343, 347, 350


thorax length, 127, 132, 217, 219 
transposable elements, 351 
wing length, 167, 269, 318 
Drosophila subobscura, 252 

Effective factors, 226 
Effective population size, 70 
ratio of, to actual, 73, 75 
Environment, see also Variance 
common, 158 
general, 140 
heterogeneous, 45 
selection in different, 322 
special, 140 

Environmental sensitivity, 136 
and selection, 324 
Epistasis, see Interaction (epistatic) 
Equilibrium 
Hardy-Weinberg, 7
with assortative mating, 177 
with mutation and selection, 35 
with natural selection, 338 
with selection for heterozygotes, 39 
Eugenics, 34 

Family size 

and heritability estimates, 179 
and inbreeding, 72 
and selection, 230 
Fisher’s fundamental theorem, 346 
Fitness, 26, 336 
components of, 336, 341 
of genotypes, 27 
‘profiles’, 339 
relative, 27, 336 
Fixation, 57 
by close inbreeding, 93 
with selection, 220 
Founder effect, 83 

Gametes 

from inbred lines, 96, 273 
random union, 8 

Gametic phase disequilibrium, 18 
and genotype frequencies, 18, 133 
by selection, 196, 343 
and overdominance, 42 
Gene frequency, 5 
change of, 6, 24, 201 
distributions of, 56, 79, 81 
variance of, 54, 65, 75 
Gene substitution 
average effect of, 116 
by neutral mutation, 77 
Generation interval, 194 
Generations 

number required, 33, 212 
overlapping, 74, 195
Genetic assimilation, 310 
Genetic deaths, 35 
Genotype-environment 
correlation, 134 
interaction, 135, 322 
Genotype frequencies, 4 
with inbreeding, 58, 65 
with random mating, 7, 10 
within lines, 67 
Genotypic value, 111, 121 

Hardy-Weinberg law, 7
test of, 11, 40 
uses of, 11 

Heritability (‘broad sense’), 126, 174 
Heritability (‘narrow sense’), 126, 163 
with assortative mating, 177 
estimation 

from human data, 173 
from relatives, 166 
from selection, 200, 209-12
precision, 179 
examples of, 165 
of family means, 233-5
after inbreeding, 267 
realized, 200 
sampling variance of, 211 
of threshold characters, 301 
of within-family deviations, 233-5
Heterosis, 254; see also Crosses 
causes of, 255 
examples of, 258 
uses of, 282 
Heterozygosity, 47 
Heterozygote advantage, see 
Overdominance 
Heterozygotes, frequency of 
with inbreeding, 60, 65 
with random mating, 10-11
after selection, 41 
Homeostasis 
developmental, 269 
genetic, 339 
Hybrids, see Crosses 
Hybrid vigour, see Heterosis 

Idealized population, 52 
Identical by descent (alleles), 62 
Inbred strains, 253, 270 
experimental use, 270 
sub-line differentiation, 272 
uniformity, 270 
Inbreeding 

coefficient of, 62, 86 
with close inbreeding, 91 


from heterozygosity, 66 
in pedigrees, 85 
close (regular systems), 91 
computations exemplified, 75 
depression, 248 
examples of, 249 
in selection responses, 213 
minimal, 74 

mixed with crossing, 96 
rate of, 64, 70, 91 
variances affected 
genetic, 265 
environmental, 268 
Index for selection 
with values of relatives, 240 
with correlated characters, 325 
Intensity of selection, see Selection
Interaction 

between loci (epistatic), 121 
and covariances of relatives, 157 
and heterosis, 259, 286 
and inbreeding, 251, 258 
and scale, 295 
deviation, 121 

genotype × environment, 135
and selection, 322 
Island model, 80 
Isolation by distance, 80 

Liability, 300 
heritability of, 302 
relation to incidence, table of, 354 
selection for, 308 
Limits, see under Selection 
Line, 52 
Linkage 

and backcrossing, 94 
and correlation 
between characters, 313 
between relatives, 158 
disequilibrium, see Gametic phase 
disequilibrium 
and effective factors, 226 
and overdominance, 43 
and stabilizing selection, 344 
Load, 35, 37, 39 

Logarithmic transformation, 107, 292-5
Maize 

combining ability, 277, 284 
diallel cross, 277 
inbred vs hybrid variability, 269 
inbreeding depression, 249, 252 
responses to selection, 218, 224, 285 
wide crosses, 260-1
Man

albinism, 34 
birth weight, 138 
blood groups, 4, 5, 12, 15 
diseases, 304 
dwarfism, 36 
finger ridges, 175, 348 
immunoglobulins, 165, 315 
IQ, 134, 175, 249, 347 
PKU, 11 

sickle-cell anaemia, 41 
stature (height), 165, 175, 249 
twins, 173 

Maternal effects, 137 
and correlations of relatives, 159 
and crossing, 259 
and inbreeding, 251, 253 
and selection response, 214 
Mating 

assortative, 21, 176, 182 
frequencies of types, 14 
non-random, 21 
random, 7 

Merit, in index selection, 326, 327 
Metric character, 104 
Migration, 24, 78 
Mouse 
body weight 
and fitness, 199, 342-3
heritability, 165, 168 
inbreeding depression, 249 
maternal effect, 159, 172, 214 
number of loci, 217 
repeatability, 140 
responses to selection, 197, 200, 210, 217, 219, 321, 329
variance and scale, 293 
fitness components, 337 
generation interval, 195 
genetic correlations, 315, 324 
growth, 107, 324 
inbred vs hybrid variability, 269 
inbreeding calculations, 75 
litter size 
distribution, 107 
heritability, 165 
heterosis, 255 

inbreeding depression, 249, 252 
number of loci, 217, 227 
repeatability, 140 
response to selection, 217 
nursing ability, 321 
pigment granules, 114, 122 
pygmy gene 
under selection, 225 
as test of scale, 296 


values (weight), 112, 114, 117, 296 
variances, 130 
sub-line divergence, 272 
tail length, 159, 165, 315, 329 
vertebra number, 307 
Multiple alleles, 14, 118 
Multiple measurements, see 
Repeatability 
Mutation 

balanced by selection, 35 

change of gene frequency, 24 

and inbreeding, 76, 99 

neutral, 46, 77, 272 

non-recurrent, 25, 77 

and origin of variation, 232, 270, 347 

rate of, 25 

recurrent, 25, 77 

Neighbourhood model, 80 
Nicotiana, 136, 259 
Non-additive 

combination of genes, 122 
variance, 132 

Overdominance, 27 
causes of, 42 
and gene frequency, 39 
and heterosis, 288 
and inbreeding, 100, 254 
marginal, 42 

and natural selection, 341, 344 
and polymorphism, 44 
and selection limits, 224 

Panmictic index, 64 
Panmixia, 7 

Pedigrees and inbreeding, 86 
Pigs, 165, 171, 249, 304, 315 
Pleiotropy, 42, 313, 342 
Polycross, 276 
Polygenes, polygenic, 105 
Polymorphism, 44 
Population 
base-, 52, 63, 98 
effective size, 70 
-mean, 113 
pedigreed, 85 
-size, 6, 53 
structured, 98 
subdivided, 51-2
synthetic, 285 
Poultry, 165, 288, 315 
Prediction 

of cross performance, 275, 277-8, 285-6

of future performance, 144 
of response to selection, 190, 233, 328
Premisses, 2 
Progeny test, 232 

Quantitative character, 104 
Quantitative variation, genes causing, 349

Random drift, 51 
in natural populations, 82 
in selection responses, 210-13 
Random mating, 7 
Range, 217 
Rats, 43, 269, 295 
Recombinant lines, 289 
Recombination, 18 
Regression, offspring on parents, 149-53

Relatives, resemblance of, see 
Correlation 
REML, 173, 245 
Repeatability, 139 
clonal, 128 
examples of, 140 
Response, see Selection 

Scale 

-effects, 291 

transformations, 106-7, 291, 305 
underlying, 300 
Selection 

accuracy of, 244, 320 
balancing, 44 

change of gene frequency by, 28, 39, 201

coefficient of, 27, 201 
combined, 236 
differential, 188 
correlated, 319, 347 
effective (realized), 199, 225 
prediction of, 191 
weighted, 198, 346 
disruptive, 345 
divergent, 198 
effects on 

genetic correlations, 331 
heritability estimation, 183 
inbreeding depression, 253 
random drift, 80 
variance, 203, 222, 293, 344 
family, 231 

frequency-dependent, 45 
index, 240, 325 
indirect, 320 
individual, 231 
intensity of, 192 


relation to coefficient of, 202 
tables of, 354-5
limits, 217 
causes of, 224 
theory of, 220 
mass, 231 

multiple-trait, 326, 327 
natural, 199, 225, 336, 345 
reciprocal-recurrent, 287 
response, 188 
asymmetry, 213, 295 
correlated, 318, 346 
duration, 217 
measurement, 196 
to natural selection, 346
precision, 196 
prediction, 190, 233 
repeatability, 209 
total, 217 
sib, 232, 235 
stabilizing, 341, 343 
tandem, 326 
truncation, 191 
within-family, 233 
Selective value, 26 
Self-fertilization, 91 
Self-fertilizing plants, 289
Sex-linked genes, 16, 92 
Sheep, 140, 249, 287 
Sib-analysis, 169 

Standardized effect, 203, 217, 226 
Systematic processes, 24 
in small populations, 76 

Threshold, 300 
characters, 300 
heritability of, 302 
selection for, 308 
-unit, 306 

Tobacco (Nicotiana), 136, 259
Tomatoes, 259 
Top-cross, 277 
Transformation of scale, 291 
Tribolium, 55, 224, 345 
True mean, 148 
Twins, 173 

Uniformity 
in inbred lines, 270 
in hybrids, 269 

Value 

breeding, 118 
economic, 327 
genotypic, 111
phenotypic, 111





Variance 
components, 125 
additive, 128 
causal, 148, 170 
environmental, 136, 158 
epistatic (interaction), 132 
disequilibrium, 133 
dominance, 128 
genotypic, 126, 129 
observational, 148, 169, 234 
of family size, 72-4
of gene frequency, 54, 75 


of observed family-means, 234-5
partitioning, 125 
methods summarized, 145 
in crosses, 273 
with inbreeding, 265 

Weighting 

in heritability estimation, 182 
selection differential, 198, 346

Wright’s F-statistics, 99 

Zygotes, 8 



